


The Manifold Experiments of Arthur Square and Spherius

by Jurgan



Category: Flatland - Edwin A. Abbott
Language: English
Status: In-Progress
Published: 2019-11-24
Updated: 2020-04-07
Packaged: 2021-02-26 07:00:35
Rating: General Audiences
Warnings: Creator Chose Not To Use Archive Warnings
Chapters: 17
Words: 33,424
Publisher: archiveofourown.org
Story URL: https://archiveofourown.org/works/21539398
Author URL: https://archiveofourown.org/users/Jurgan/pseuds/Jurgan
Summary: Thank you for coming along with me on this weird experiment! What you should expect is an attempt to understand differential geometry and manifolds through the characters of Abbott’s classic novella. I have been struggling through Manfredo Do Carmo’s textbook Riemannian Geometry, and it suddenly occurred to me that there was already a model for exploring the abstraction of other dimensions. So I will be translating the main theorems and definitions of Do Carmo’s book into this new context. I will try to keep the explanations clear, but some basic understanding of calculus and linear algebra would be helpful. This will be more of a thought experiment than a traditional narrative (my main goal is to boost my own understanding), but my professor says Do Carmo’s book tells a story, so let’s see how it comes together. Any questions or suggestions are more than welcome!
Comments: 2
Kudos: 5





	1. In Which a Sphere Recruits a Square

**Author's Note:**

> Thank you for coming along with me on this weird experiment! What you should expect is an attempt to understand differential geometry and manifolds through the characters of Abbott’s classic novella. I have been struggling through Manfredo Do Carmo’s textbook Riemannian Geometry, and it suddenly occurred to me that there was already a model for exploring the abstraction of other dimensions. So I will be translating the main theorems and definitions of Do Carmo’s book into this new context. I will try to keep the explanations clear, but some basic understanding of calculus and linear algebra would be helpful. This will be more of a thought experiment than a traditional narrative (my main goal is to boost my own understanding), but my professor says Do Carmo’s book tells a story, so let’s see how it comes together. Any questions or suggestions are more than welcome!

The sphere was in over his head.

  
The sphere was one of the top research scientists in Spaceland, and everyone knew it. He had explored every corner of their world (metaphorically, of course- Spaceland was infinite and open in every direction) and thought he knew what was to come. But several days ago they had discovered something none of their experiments prepared them for.

When he looked at the photographs, they were unremarkable. Various shapes in three dimensions- a cube, a truncated tetrahedron, a pair of frustums with a cube connecting them. But when he visited in person, it was a different story. These weren’t “various shapes,” but a single shape rotating through their world. It was something that could not be explained with standard physics or three-dimensional vector calculus. However, one of the researchers on the site had created a model using a 4x4 matrix to illustrate how the shapes they saw could be merely the three-dimensional shadow of a fourth-dimensional object.

The sphere broke into a cold sweat when he heard those words. He had dismissed the idea of a fourth dimension as nonsense for years, and thanks to his reputation, none dared to challenge his beliefs. None, that is, except for one humble man who knew nothing of his reputation. One man who could see beyond his limited surroundings and imagine not only a third dimension, but a fourth as well. The sphere would have to do something he had never done before: Swallow his pride and admit he was wrong. It was time to return to Flatland.

The triangle was growing. Arthur Square did not recognize this triangle as one of his neighbors. It was perfectly equilateral and grew as large as Arthur himself. Suddenly, the triangle began to change, becoming- “Sir, why has a resident of the third dimension chosen to visit our world this day?”

Arthur heard a deep laugh echoing around him. “Well done, Arthur Square. Your insight is as sharp as ever.”

“Spherius! You’re back! Do you have more wisdom to impart?”

“Quite the contrary, I’m afraid. I need your help.”

Arthur jumped back. Help? “Why would someone as great as you need the help of a simple square?”

“You needn’t partake in such flattery, Arthur. The truth is we are both simple compared to the ones I am investigating. You see, you were right. There is a fourth dimension.”

“A fourth- but what has that to do with me? I can barely comprehend a third.”

“You are too humble, Arthur. You were able to conceive of things beyond even my imagining with nothing but pure logic. You are a better mathematician than I, and I need your help studying things I can’t see.”

“Well, that’s an interesting proposal, Spherius. But I suspect there’s more to it than that. I suspect you are also planning some experiments on me.”

The laughter echoed again. “You are correct on that, as well. I am an experimental scientist, and I need data to draw conclusions. I can’t conceive of what I look like to a fourth-dimensional being, but I do know what you look like to a three-dimensional being. If, together, we can shed light on the interaction between the second and third dimensions, perhaps we can extrapolate to learn how the third interacts with the fourth.”

“I see why you need me. And I’m always interested in a chance for learning. Where do we start?”


	2. On the Meaning of a Manifold

“The first thing to understand,” said Spherius, “is that your Flatland is not truly flat.”

Arthur frowned. “How can that be, though? I see northward, southward, eastward, and westward, but I see no upward or downward.”

“Because you cannot see them, Arthur. That doesn't mean they do not exist. Walk to the east.” Arthur turned and saw an empty field stretching out seemingly to infinity. He began to walk due east. “What you do not sense is that you are moving slightly downward as you walk east. It only appears flat to you because you cannot perceive the downward direction. But to an objective observer, your Flatland has many folds that create upward and downward regions- what we in Spaceland call 'hills' and 'valleys.'”

“I see. But how can you be sure that I am only moving eastward and downward?”

“Arthur Square,” huffed Spherius, “I have devices to measure directions in their absolute precision, and I have calibrated them so that the directions are at exact ninety degree angles. I can assure you that you are not engaged in any northward or southward movement.”

“Now, Spherius, I mean no disrespect, but I'm afraid I see a flaw in your logic. You see- what do you call this new fourth dimension?”

“Typically it's called a tesseract.”

“Yes, but instead of moving northward and southward-”

“Ah, well, the scientist who modeled the fourth dimension called it 'ana' and 'kata.'”

“Ana and kata,” repeated Arthur deliberately. “In that case, how can you be sure that I am not also moving that way?”

“I've told you that you're moving downward; if you were moving upward-”

“Anaward, not upward,” said Arthur firmly. “You would be unable to detect motion in the fourth dimension, just as I am unable to detect motion in the third dimension. For all you know, a being in the fourth dimension would say that your Spaceland is itself twisted in four dimensions. But then, a fifth-dimensional being could say the tesseract was twisted in five dimensions.”

“But then-” Spherius's voice trembled slightly, “but then there is no hope of understanding! All of our geometry is based on the idea that there are cardinal directions which provide an absolute frame of reference.”

“I don't think it's as bad as all that, my friend. You asked me to help you with pure mathematics. Well, in pure mathematics, much of what we use is a fictional construct. For example, the real number line is infinitely divisible- any segment, no matter how small, can be split in two, even though quantum physics says nothing is smaller than the Planck length. I submit to you that we can accept a hypothetical coordinate system that consists of any number of lines, each of which is a copy of the real number line, and all of which meet at right angles. This may not be something we can physically construct, but we can imagine it as a Platonic Ideal of real geometry.”

“But how does your hypothetical geometry help us study the real world?”

“I think it can be used to study it locally, at the very least. Watch.” Arthur picked up a stick and drew a circle around himself, sealing himself behind a round wall. “To me, this is a circle. It has a fixed radius and every point on the circle is the same distance from me. Now, to you it may be twisted, but since Flatland only consists of two dimensions, we needn't trouble ourselves with its properties in higher dimensions. From a perspective inside Flatland, this is a circle.”

“So you are suggesting that we identify a circle in your two-dimensional world with a circle in R^2.”

“Exactly! I'll call this an open neighborhood, and I posit that it is identical to a circle in R^2 in all the ways that matter. Specifically, the circle in R^2 and the one here are both open.”

“What exactly do you mean by 'open?'”

“This is a notion from topology, which in its simplest form means that it has no boundary. I've drawn the edge of the circle (though I could have drawn any other shape with both length and width), but I will only consider points within the circle. Since our goal is to perform Calculus operations, we need a concept of 'limit,' which means it must be possible to approach a point in this circle from any direction. By dismissing the boundary, we ensure every point in the neighborhood is the center of a smaller circle that is entirely contained in the neighborhood.”

“Fascinating, Arthur. So now that you have this concept of neighborhoods, we should create a function that equates this neighborhood with one in R^2.”

“I agree, and since we want to preserve this concept of openness, we should use a function which is continuous and has a continuous inverse. These are called homeomorphisms.”

“All right, Arthur, so we wish to define a set of maps from R^2 onto your Flatland where each one preserves openness in both the forward and backward directions. So these maps must be one-to-one and onto, sending open balls in R^2 onto open balls in Flatland.”

“I agree, and furthermore there should be enough of them to cover all of Flatland. As for your Spaceland, you should be able to cover it with homeomorphisms from R^3. In fact, any similar space- wait just a minute,” said Arthur, suddenly concerned. “Are we sure that one of these new spaces can be covered with balls of the same dimension? What if it were two-dimensional in one area and three-dimensional in another?”

“Hmm, an interesting hypothetical. Practically, I don’t think it’s possible- I can always measure the depth of objects in Spaceland, though their depth might be quite small indeed. Well, unless the manifold were disconnected- we could have a circle and a line segment that are separated from each other, but then for practical purposes they would be two distinct manifolds.”

“But in theory- could a two-dimensional object taper off to a simple line segment?” Arthur paced for a moment. “I suspect this is impossible to occur in any continuous sense- switching between dimensions would almost certainly cause a break in continuity, though proving it is non-trivial.”

“Well, let’s set it aside for the moment and assume our spaces are covered by balls of the same dimension. Then the space itself could be said to have that dimension, and each of the maps would be from one n-dimensional space to another. That is to say, each map into Spaceland would be a vector function with three inputs and three outputs. Flatland’s maps would be two-by-two, and Lineland could be covered by a simple set of open intervals. Hmm, that description reminds me of parametrizing functions in R^3- I think I’ll call these maps of open sets parametrizations.”

“Really? Since I see my local neighborhood as surrounded by a circle, I would have called them coordinate neighborhoods.”

“I suppose either term is adequate. A bigger concern is that these neighborhoods may overlap.”

“Not only ‘may,’ but must,” corrected Arthur. “Any path you travel through Flatland, or indeed any connected set, must always be in at least one neighborhood, and since open sets have no boundary, they must overlap in some places.”

“Exactly, and that is a concern. You see, a single area in Flatland might be covered by multiple different neighborhoods. My fear is that our choice of parametrizations could lead to radically different properties being applied to the same location, an ambiguity that simply cannot be allowed.”

“That is a good point, Spherius. We should ensure there is some smoothness in the overlap of these neighborhoods. Let’s add in a new requirement that when we transition from one coordinate system to another, we must do so smoothly.”

“You’re talking about differentiability.”

“Correct. Let’s say we have two sets U1 and U2 with associated maps **x1** and **x2**, and suppose the images of U1 and U2 overlap in a set W. Then we should be able to move from U1 into W and back to U2 smoothly. Since the U’s are both in R^n (n is the dimension of the space), we have a map **x2**^-1 o **x1** from R^n to R^n that is differentiable in the sense of ordinary vector calculus.”

“Brilliant!” shouted Spherius, rattling Arthur’s guts a bit. “By using our local coordinates, we are able to create maps that go through our space, but whose domain and range are both in the same Euclidean space! Now all of our familiar rules of calculus apply!”
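
A small numerical sketch of such a transition map may help. It assumes a hypothetical example that does not appear in the story: the unit circle covered by two arcs, with the illustrative names `x1`, `x2`, and `transition` standing in for Arthur's **x1**, **x2**, and **x2**^-1 o **x1**. It only suggests that, on the overlap, the composite is an ordinary function of one real variable with an ordinary derivative.

```python
import math

# Hypothetical example (not from the story): two parametrizations of arcs
# of the unit circle, a one-dimensional manifold sitting in R^2.

def x1(t):
    """Upper half-circle, parametrized by the horizontal coordinate t in (-1, 1)."""
    return (t, math.sqrt(1.0 - t * t))

def x2(s):
    """Right half-circle, parametrized by the vertical coordinate s in (-1, 1)."""
    return (math.sqrt(1.0 - s * s), s)

def x2_inverse(point):
    """Inverse of x2 on the right half-circle: read off the vertical coordinate."""
    return point[1]

def transition(t):
    """The composite x2^-1 o x1, defined where the two arcs overlap (0 < t < 1)."""
    return x2_inverse(x1(t))

def difference_quotient(f, t, h=1e-6):
    """Central-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

t = 0.6
print(transition(t))                       # sqrt(1 - t^2)
print(difference_quotient(transition, t))  # estimate of -t / sqrt(1 - t^2)
```

On the overlap the composite reduces to t -> sqrt(1 - t^2), which is smooth for 0 < t < 1, so these two arcs would be compatible in Arthur's sense.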

“I’m glad you approve,” laughed Arthur. “We can say that we have provided Flatland with a differentiable structure. For technical reasons, I’d like to say that our structure is maximal, meaning that it includes all coordinate neighborhoods that meet these requirements. So if we take the union of different parametrizations, we should include any new parametrizations that arise.”

“I don’t see any problem with that. But we need a name for these new constructs- ‘a topological space with differentiable structure’ is quite a mouthful.”

“Let’s see… you said Flatland had ‘many folds,’ so why don’t we call it a ‘manifold?’ A differentiable manifold.”

“A differentiable manifold… I like the sound of that. So a differentiable manifold is any space that can be covered by the homeomorphic images of open sets in R^n, where the transition functions from R^n to R^n are all differentiable.”

“Well summarized, Spherius.”

“But we earlier called these topological spaces, so we need a topology. I suppose a set in a manifold is open if it is the image of an open set?”

“That’s close, but it’s not an ideal definition. It doesn’t take into account the issue of different coordinates, for one. Topology is typically more compatible with inverse images. I would say that a set A in M (the manifold) is open if its preimage is open under all possible coordinate charts. That is to say, if A intersects with some local neighborhood **xi**(Ui), then we pull back their intersection by **xi**^-1 and get an open set in R^n. The point is that this must be true for any possible neighborhood that intersects with A.”

“Very well, Arthur, since you are the pure mathematician I will defer to your idea and add that to the list. So our manifolds truly are topological spaces.”


	3. How Arthur Entered the Tangent Space to Find Tangent Vectors

“I would like to explore mapping between different manifolds,” said Spherius. “After all, structures are not as interesting alone as they are in conjunction with one another.”

“Are there other Flatlands in your world?”

“Think it through, Arthur. How many lines are in Flatland?”

“I suppose there are an infinite number of them, plus the many curves I could draw that would be locally homeomorphic to lines- ‘one-dimensional manifolds,’ to use our new terminology.”

“Correct, so by analogy there are infinitely many two-dimensional manifolds in Spaceland. I’ve found one nearby that is convenient for our purposes- it is uninhabited but perfectly safe for you. I want to observe your motion as you pass between the two of them, but from your perspective. I’ve created what we call an ‘elevator’ between the manifolds, so you and the space around you will be lifted- that is, translated through the third dimension to place you on a new manifold. Are you ready?”

Arthur took a deep breath to steady his stomach, observing the space around him and memorizing the location of the specks of dust and dirt. “All right, do it.” He expected a jolt or some pain, but it was almost unnoticeable. The circle stayed around him but slowly the objects spread apart.

“There, how was that?”

“Fine,” said Arthur. “Things are farther apart, but the transition… I’d say it was ‘smooth.’”

“Interesting, and perhaps a telling word. The circle appears larger to you, so we could draw an R^2 coordinate map around you and it would be twice as large as the one you started with.”

“Yet that is entirely subjective,” said Arthur. “You could easily scale that map down so that the map between local coordinates was the identity map.”

“We could, but that is not much more than a notational choice. The doubling map and the identity map should both be smooth if we are to be consistent with classical calculus. Now let’s attempt a mapping that is not smooth. Brace yourself, Arthur.” This time, Arthur felt as though he were being crushed. Half of him jerked in one direction while the other half pulled in another then folded over on itself, and he felt pain split through his body.

“Never do that again!” shouted Arthur.

“Oh, I’m so sorry, my friend! I had no idea it would be so sharp.”

“What kind of mapping was that?”

“It’s hard to describe precisely. Manifolds are difficult to measure, after all. If we were to compare the local coordinates where you started and ended, it would be the absolute value function in both coordinates. You were at the origin, so both your horizontal and vertical dimensions were being pulled apart.” (Figure 3.1)

“I see,” said Arthur, slowly regaining his composure. “You have a point there- it’s easier to talk about mappings between the local coordinates than the manifolds themselves. So a smooth map should be one where the map between local coordinates is itself smooth.”
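
The sharpness Arthur felt can be sketched numerically in one coordinate. The helper name `one_sided_slope` is an illustrative assumption; the code only compares secant slopes of the absolute value function on either side of the origin with those of the doubling map from the earlier experiment.

```python
def one_sided_slope(f, point, step):
    """Slope of the secant from `point` to `point + step` (step may be negative)."""
    return (f(point + step) - f(point)) / step

def double(t):
    """The doubling map from the first, painless experiment."""
    return 2.0 * t

h = 1e-6

# Absolute value: the slopes seen from the right and from the left disagree,
# so no single linear map approximates it at the origin.
print(one_sided_slope(abs, 0.0, h), one_sided_slope(abs, 0.0, -h))

# Doubling map: both sides agree, consistent with a smooth transition.
print(one_sided_slope(double, 0.0, h), one_sided_slope(double, 0.0, -h))
```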

“I vaguely see what you mean, but can you spell it out a bit more?”

“Consider this: I will name the spot where I am the origin, and mark distances around me.” Arthur scratched a rough coordinate system in the dirt. “Now this coordinate system defines a local map from R^2. We then do the same in the target of this space, assigning it a map from R^2. To me, I see two separate coordinate systems, so I can define where each point around me starts and where it ends. Then we have a map from R^2 to R^2, and so long as my neighborhood lands entirely inside a single neighborhood in the new space, that map can be judged to be differentiable or not.”

“And we evaluate differentiability based on matrices,” said Spherius, reaching back to his studies of vector calculus. “In single-variable calculus, a derivative is a linear operator that gives the slope of a tangent line. So a multi-variable derivative should also be linear- i.e., a matrix multiplication- and should also approximate the change at any point.”

“‘Any point’ may be too much to ask. Given how complicated some of these manifolds may get, and how many different coordinate neighborhoods we might have, I doubt we’ll always be able to find a single operator that works for every point. Rather, I say that we fix a point first, and then find the matrix that is the best approximation for the mapping’s action at that point. For example, say the earlier mapping doubled the distance of points in my view. Then we’ll call the map itself **phi** and the coordinate maps for Flatland and the other manifold **x** and **y**, respectively. So we could say the map between coordinate systems is **y**^-1 o **phi** o **x**, where o represents composition of functions. The overall map would take any vector **v** to 2**v**, so it is already a linear map. Some other mapping might not be linear but can still be approximated as linear at a specified point p, so we could call it differentiable at p.”
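
To make "the best matrix at a fixed point" concrete, here is a rough sketch that estimates that matrix by finite differences. The function names and the nonlinear map `bent` are illustrative assumptions; only the doubling map comes from the conversation.

```python
def jacobian(F, point, h=1e-6):
    """Central-difference estimate of the Jacobian matrix of F at `point`.

    F maps a list of n numbers to a list of m numbers; the result is an
    m-by-n matrix stored as a list of rows.
    """
    n = len(point)
    columns = []
    for j in range(n):
        forward = list(point)
        backward = list(point)
        forward[j] += h
        backward[j] -= h
        f_plus, f_minus = F(forward), F(backward)
        columns.append([(a - b) / (2.0 * h) for a, b in zip(f_plus, f_minus)])
    # columns[j][i] holds dF_i/dx_j; transpose so that rows index outputs.
    return [list(row) for row in zip(*columns)]

def doubling(v):
    """The coordinate expression of the doubling map: v -> 2v."""
    return [2.0 * v[0], 2.0 * v[1]]

def bent(v):
    """A hypothetical nonlinear expression, used only for contrast."""
    return [v[0] + v[1] ** 2, v[0] * v[1]]

print(jacobian(doubling, [0.3, -1.2]))  # close to [[2, 0], [0, 2]] at every point
print(jacobian(bent, [1.0, 2.0]))       # close to [[1, 4], [2, 1]] at this point only
```

The doubling map yields the same matrix wherever it is evaluated, while the nonlinear map yields a matrix that depends on the chosen point, which is why fixing the point first is the safer order.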

“I see your intent, but I am concerned that this definition is too localized. Suppose I choose a different coordinate system than you. How can we ensure a map you judged differentiable will not have problems in my coordinates?”

“Ah, but you forgot, we defined manifolds so that transition functions are always smooth. So we could simply compose my maps with a smooth transition function to get your maps, which would still be smooth.”

“Of course!” said Spherius. “So the way I express a map may be different from the way you do, and yet mine will be smooth still. I think we should call the specific choice of coordinate maps an _expression_ of **phi**.”

“All right, let’s do that. So a differentiable map **phi**: M1 -> M2 is one where… let’s see, in calculus we usually start with conditions on the codomain and then make suitable choices in the domain. So we’ll start with a point p in M1 and say that if **y** maps some open set V in R^m onto an open neighborhood of **phi**(p) in M2, then we can create a map **x** that sends an open set U in R^n onto a neighborhood of p. If we choose our U and **x** carefully, **phi** will map the set **x**(U) into the interior of **y**(V), and then if we compose all of those maps, we get a differentiable map.”

**phi**(**x**(U)) is contained in **y**(V) and **y**^-1 o **phi** o **x**: U -> R^m is differentiable at **x**^-1(p) (p is in **x**(U), and U is in R^n)

“We so far have only discussed maps between two-dimensional manifolds, but this should be fully generalizable to mappings between n and m dimensional manifolds. So, what’s next?” asked Arthur.

“I think I would like to deal with the concept of velocity. Tell me, Arthur, how would you define velocity in Flatland?”

“Interesting question. The simplest answer is the change in position over change in time, but it’s more complicated for objects that accelerate. Let’s start with a picture.” Arthur drew an arc in the dirt. “As a bug runs along this curve, its velocity is changing. Even if its speed is constant, velocity is a vector quantity, so we must account for direction. Now let’s say at this point-” Arthur marked a spot midway up the curve- “the bug jumped off the curve. The easiest path to take would be a straight line tangent to the curve.” He then drew such a line. “The line can be parametrized as a vector multiplied by time added to another vector that represents the initial position. We would call the vector **v** that multiplies time the ‘velocity’ at the point.”

“Very good, Arthur! And how many such velocity vectors are possible?”

Arthur frowned. “Let’s see, I suppose the bug could in theory be running at any speed, so the vectors would be any positive multiples of **v**.”

“Ah-ah, the bug could run in the opposite direction.”

“Oh, of course,” said Arthur sheepishly. “So it would be all scalar multiples of **v** , including 0 if the bug stopped moving altogether. Then… they make a line!”

“Brilliant! So all the possible tangent vectors at that point, or indeed any point on the curve, will form a line, which we call the _tangent line_. Yet this line would, to a one-dimensional being, simply look like another line identical to the curve it just left. So let’s see what happens when you leave Flatland.”

“But I am bound to Flatland unless you lift me up out of it, so how could I leave under my own power?”

“It’s simpler than you think. There is an adhesive force that sticks you to Flatland. All I have to do is cancel that force at one point, and you will enter Spaceland in whatever direction you were already moving.”

“Is that… well, is it safe?”

“Arthur, don’t you trust me?”

Arthur thought back to the non-smooth mapping from earlier but held his tongue. At a point ahead of him, a bright light began to glow.

“Run at the light, Arthur. Run as fast as you can.”

Arthur took a deep breath and stepped back, then charged full speed at the light. As he hit it, he charged right through and suddenly everything looked different. He slowed to a stop and looked around. The world was a mix of gray and blue, except for the glowing point he had come from. “Where in the Mandelbrot Set am I?”

Spherius chuckled. “You are in a plane- a flat two-dimensional space. At least, it is flat to my eyes. A four-dimensional being might disagree. Regardless, you have left Flatland and are now on a plane tangent to it.”

“And what does it mean for two manifolds to be tangent to one another?”

“I trust in your capacities of reason, Arthur. Work it out.”

“All right. In our earlier example, we had a straight line that was tangent to a curve. That means it was a one-dimensional space that, in two dimensions, was close to the curve as long as you did not move too far from the point of tangency. So this new plane must be close to Flatland from your perspective, but it gets farther away as I move from the glowing point. But that leaves me with a question- what is ‘flat?’ Before, you called it the property of being unable to move up or down, so are you saying this plane also does not move up or down?”

“No, it actually has a bit of a tilt from my perspective. To be flat… well, I suppose it means that its rise and fall in the third dimension is constant with respect to motion in the other two dimensions. If you walk northeast from any point in this plane, you will rise at a rate of about 1 foot for every 5 feet of walking.”

“But that means… that means that this plane is linear! There must be a single matrix such that any vector in two-space can multiply by the matrix to give its motion in the third dimension!”

“I think you’re right. We call that matrix (although it’s really just a row vector) the _gradient_. If x and y are the coordinates in two-space and z is the third coordinate, then the vector <dz/dx, dz/dy> is the gradient, and any flat surface must have a constant gradient.”

“It’s better than that, Spherius. Say we have y as a function of n variables {x1, x2,…, xn}. Then we can imagine the graph of y as an n-dimensional surface in R^(n+1), the same way that the curve I drew earlier was a one-dimensional object in R^2. Then we could say that the gradient <dy/dx1, dy/dx2,…,dy/dxn> being constant would make something like a plane in R^(n+1). So instead of saying a constant gradient is a property of flat spaces, let’s simply define flat to mean any n-dimensional space in R^(n+1) that has a constant gradient. This is like a plane in higher dimensions- let’s call it a hyperplane.”

“I would call this an _affine hyperplane_, with ‘affine’ describing the way it skews away from Flatland, but I think in most cases it will be understood. And if the gradient is constant, that means we could see the hyperplane described by a linear equation y = a1x1 + a2x2 + … + anxn + b, where the a’s and b are all constant.”
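
As a rough check of the constant-gradient idea, the sketch below uses a hypothetical tilted plane whose coefficients are assumptions, chosen so that walking northeast climbs 1 unit for every 5 units walked, like the tangent plane Spherius describes. The names `height`, `gradient`, and `climb_rate` are illustrative only.

```python
import math

# Hypothetical tilted plane z = a*x + b*y + c. The coefficients are chosen so
# that walking northeast climbs 1 unit per 5 units walked.
a = b = math.sqrt(2.0) / 10.0
c = 3.0

def height(x, y):
    """Third coordinate of the plane above the point (x, y)."""
    return a * x + b * y + c

def gradient(f, x, y, h=1e-6):
    """Central-difference estimate of <dz/dx, dz/dy> at (x, y)."""
    dz_dx = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    dz_dy = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
    return (dz_dx, dz_dy)

def climb_rate(grad, direction):
    """Rise per unit distance walked along a unit direction vector."""
    return grad[0] * direction[0] + grad[1] * direction[1]

northeast = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))
for point in [(0.0, 0.0), (4.0, -7.0), (100.0, 250.0)]:
    g = gradient(height, *point)
    print(point, g, climb_rate(g, northeast))
```

Every sample point reports roughly the same gradient and the same northeast climb rate, which is the sense in which the plane is "flat" despite its tilt.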

“Why, that’s true as well. So let’s return to the discussion of this strange plane I’m in. I’m going to set a marker here.” He took an orange pebble he’d been carrying and laid it down. “I’m now going to return to Flatland and enter from a new direction.” Arthur went back to the glowing point and returned. He then backed up at an angle about 120 degrees from his first path and this time walked slowly towards the light. Once again, he was in the blue-gray space. He looked around and saw the orange rock. “There it is! So I am in the same space!”

“I think it’s a necessity. No matter what direction you leave Flatland in, as long as you leave from that point you’ll be in the same plane. So we might as well call this the _tangent plane_. If the marked point is called p, and Flatland (or any other manifold) is called M, then we’ll denote the tangent plane by TpM.”

“And I can cover all of this space by entering from different angles. In addition, I can change my speed, so all the different ways I enter correspond to every possible vector in R^2. But are you sure there can only ever be one tangent plane?” asked Arthur. “Think about the absolute value function. At its vertex, a bug could enter two different lines depending on which direction it approached from.”

“True enough, and if you were on a cone or a pyramid… well, you won’t be able to picture those, but you could enter many different planes at their peaks. However, those all have in common that the derivative does not exist. I think that, as long as our manifold is differentiable at p, there will only be one tangent plane at p. I also notice that the dimension of the tangent space is the same as the dimension of the space it is tangent to.”

“I noticed that, as well. I traveled along a vector in R^2, and then I departed from Flatland, but my vector in R^2 was the same. Essentially, you’re canceling the part of my path that was in the third dimension and replacing it with a linear multiple of my velocity. Since the tangent space has a constant gradient, and since the gradient is equal to my velocity at the point of tangency, it must be the same dimension. Therefore the tangent space has to be the same dimension as the original space.”

“Excellent! And while the tangent space may be tilted in my local perspective, it is congruent to a flat plane. I could simply rotate it to lie flat, meaning that, once again, its motion in the third dimension is purely a local illusion. In the same way, one of your tilted lines could be seen as a straight line from another angle, so it’s a one-dimensional manifold embedded in a two-dimensional manifold. So we now have the concept of a tangent space, and we should be able to explicitly define a tangent vector. The obvious meaning is that it is the speed and direction at which you leave Flatland, but…”

“But we need to formalize it. Let’s call the path I ran along in Flatland **alpha**. Since it’s one-dimensional, it must have a parametrization **alpha**(t) so that it is the image of a line segment in R^1. I’ll say the point of tangency is t=0 and I’ll allow both positive and negative values of t; that way the curve passes through the point and I can approach from either direction. So **alpha**: (-c, c) --> M, where M is Flatland or any other manifold, and we’ll say **alpha**(0) = p.”

“So far this all seems like standard calculus, except that we are on a manifold instead of a Euclidean space. But if M is an n-dimensional manifold, then the image of **alpha** is in M, rather than in R^n, and it’s hard to say how the derivative would be defined.” Spherius spun on his axis as he thought. “We should probably be able to address this issue using local coordinates, but let’s refresh ourselves on the rules of calculus. Suppose **alpha** mapped R into R^n by **alpha**(t) = (x1(t), x2(t),…,xn(t)). Then each of these coordinate functions is independent.”

“Why would that be so, Spherius?”

“Well, because R^n is a vector space and so each dimension is orthogonal to every other one. Therefore each coordinate function is real-valued, and we can assess the differentiability of each separately. So the derivative is **alpha’**(t) = (x1’(t), x2’(t),…,xn’(t)). Then the tangent vector at t=0 is (x1’(0), x2’(0),…,xn’(0)), which is simply a vector in R^n.”

“This is coming back to me,” said Arthur. “I must confess, some of the details of calculations have slipped my memory- a combination of old age and lack of use. You made a point that we are working in a vector space, so we can write the vector you found as a sum of basis vectors, correct? But what is the basis?”

“The most obvious choice is the partial derivatives themselves. A vector, of course, could be viewed as an arrow in space. The _directional derivative_ at a point, then, would be the sum of each partial derivative at that point multiplied by the magnitude of the vector in that component. For example, if f(x, y, z) = x^2 + 3y^3 - 2z and **v** were (2, -1, 4), then the derivative of f in the direction of **v** is 2*2x + (-1)*9y^2 + 4*(-2) = 4x - 9y^2 - 8, which can be evaluated at a point (x, y, z). Or we can combine f with **alpha** to get a function from R to R, and the directional derivative is d(f o **alpha** )/dt evaluated at t=0.”

“True, but so far it’s seemed best to fix a point first and then explore the local derivative. The vector **v** comes directly from evaluating the derivative of the curve **alpha** at t=0, so the directional derivative is the dot product (x1’(0), x2’(0),…,xn’(0)) **.** (d/dx1,…,d/dxn). We can then apply this operator to any function we choose.”

“Slow down, Arthur. This is getting to be too many layers. How did our derivative become an operator, and what functions are you applying it to?”

“It’s really quite simple, Spherius,” said Arthur, trying not to gloat at outthinking the sphere. “You see, if we fix a point **alpha** (0) = p, then we can find the tangent vector of **alpha** itself at that point- it would just be the **alpha’** (0) from above. But if we treat the coordinates of **alpha’** (0) as the scalar coefficients of each partial derivative, then we have a new derivative operator that sends functions of R^n to real numbers. That is to say, we can take any real-valued function that is differentiable at p, and we can use the partial derivative operator to tell how quickly the function’s value is rising or falling in the direction **alpha’** (0). In this sense, **v** is not only a vector, but it is a _functional_ , i.e., a mapping that takes functions to real numbers based on their behavior near p.”

“I think I see, though I may wish to practice a few examples later. Another way of thinking is that f is specifically being evaluated along **alpha** , so instead we could take the function f o **alpha** and take its derivative at 0. I suppose it would amount to the same thing. What strikes me as most important is that, once we know **v=alpha’** (0), then we have fully defined this operator. Then effectively we can feed it any function f and receive a real number representing the rate of change along that vector.”

“Then that should be our definition,” said Arthur firmly. “When generalizing from a concrete case, the most crucial properties should become the new definition. Given some differentiable manifold M, **alpha** : (-c,c) --> M is a curve in M, and we can define the _tangent vector to **alpha** at t=0,_ **alpha’** (0), as an operator which sends functions differentiable at p to R by **alpha’** (0)(f) = d(f o **alpha** )/dt evaluated at t=0. And each of these tangent vectors is a member of the tangent space TpM.”
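
For the reader who wishes to test this definition, here is a small worked case. The curve, the point, and the function are illustrative choices of the narrator, not anything Arthur or Spherius wrote down: take M = R^2, the curve alpha(t) = (cos t, sin t) with p = alpha(0) = (1, 0), and f(x, y) = x^2 + 3y.

```latex
\begin{align*}
(f \circ \alpha)(t) &= \cos^2 t + 3 \sin t, \\
\alpha'(0)(f) &= \left.\frac{d (f \circ \alpha)}{dt}\right|_{t=0}
   = \left.\bigl(-2 \cos t \sin t + 3 \cos t\bigr)\right|_{t=0} = 3, \\
\text{and, since } \alpha'(0) = (0, 1):\quad
 0 \cdot \frac{\partial f}{\partial x}(1,0) + 1 \cdot \frac{\partial f}{\partial y}(1,0)
   &= 0 \cdot 2 + 1 \cdot 3 = 3.
\end{align*}
```

Both routes give the same number, which is what one would hope if the operator really is just the directional derivative in disguise.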

“Very well, but I still would prefer to deal with this in local coordinates. It’s much easier for us practical-minded fellows. Let’s say we have a parametrization **x** : U --> M^n with **x** ( **0** )=p (note **0** is now the zero vector, since U is a subset of R^n). Then for a point q=(x1, x2,…,xn) in U (so q is near **0** ), we can say f o **x** (q) = f(x1, x2,…,xn), and now f o **x** :R^n --> R, meaning we no longer have the troubling question of defining f on an abstract manifold. And the curve **alpha** is in M, so we should pull back along our parametrization so its values are in R^n: **x^-1 o alpha** (t) = (x1(t),…,xn(t)). Then we can recover our earlier definition so that **alpha’** (0)(f) is the derivative of f composed with **alpha**. It ends up being a simple application of the chain rule, so that the directional derivative is SUM (xi’(0)*d/dxi)(f). Then **alpha’** (0) = SUM (xi’(0)*d/dxi).”
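
The chain-rule step that Spherius describes can be sketched as follows. The symbols are the ones from the dialogue (the parametrization x with x(0) = p, and the pulled-back curve with coordinates x_1(t), ..., x_n(t)); the only assumption added here is that f composed with x is differentiable at the origin. Note that the coefficients that appear are the derivatives x_i'(0).

```latex
\begin{align*}
\alpha'(0)(f)
  &= \left.\frac{d}{dt}\,(f \circ \alpha)(t)\right|_{t=0}
   = \left.\frac{d}{dt}\,(f \circ \mathbf{x})\bigl(x_1(t),\dots,x_n(t)\bigr)\right|_{t=0} \\
  &= \sum_{i=1}^{n} x_i'(0)\,
     \left.\frac{\partial (f \circ \mathbf{x})}{\partial x_i}\right|_{\mathbf{0}}
   = \Bigl(\sum_{i=1}^{n} x_i'(0)\,
     \left.\frac{\partial}{\partial x_i}\right|_{\mathbf{0}}\Bigr)(f).
\end{align*}
```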

“I think that’s it.” Arthur yawned involuntarily. “Excuse me. It’s been a long day, Spherius. What say we pick this back up tomorrow?”


	4. How the Jacobian Became the Differential

Arthur was sketching a curve in the sand when Spherius appeared.

“Good morning, Arthur!”

“Hello, Spherius,” said Arthur absentmindedly as he drew the x and y axes and marked a point on the curve. “I was thinking more about differentiability.”

“Not even a ‘did you sleep well,’ I see. Very well, let’s get to it.”

“You see, we typically think of a graph as the solution set of an equation, but what if it’s more than that? What if we think of the graph as a one-dimensional manifold in two-space?”

“Certainly we can do that.”

“Well then, if we focus on this point ‘p’ that I’ve marked, we could see it as a mapping from one manifold to another. We could take horizontal vectors and transform them to vertical vectors. Look, the derivative here is 2, so take a small horizontal vector h. Then if you add to it the vertical vector 2h, you have approximately the new output of the function.”

“2h is approximately f(x+h)-f(x), though the estimation becomes cruder as the magnitude of h increases. So far this is nothing more than basic calculus.”

“Right, but the key point is that the ‘times two’ operation is linear. Calculus would never have us square the horizontal vector, for example. So if we were to try to do a similar operation between R^n and R^m, we would multiply a small magnitude vector based at p by an m x n matrix, since matrices are the natural multi-dimensional analogue to linear maps.”

“So far, so good,” said Spherius. “We would get a different linear map at different points, but if we fix a point then it’s a specific matrix. We call it the _Jacobian_ , and it is simply the matrix of partial derivatives of the function f evaluated at the point of interest.”
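
As a sketch of what Spherius means, here is the Jacobian written out for a map f from R^n to R^m with coordinate functions f_1, ..., f_m, followed by one small example. The example map and the point (1, 2) are the narrator's illustrative choices, not taken from the conversation.

```latex
\begin{align*}
J_f(p) &=
\begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1}(p) & \cdots & \dfrac{\partial f_1}{\partial x_n}(p) \\
\vdots & \ddots & \vdots \\
\dfrac{\partial f_m}{\partial x_1}(p) & \cdots & \dfrac{\partial f_m}{\partial x_n}(p)
\end{pmatrix},
\qquad
f(p + h) \approx f(p) + J_f(p)\,h \ \text{ for small } h. \\[1ex]
f(x, y) &= (x^2 y,\ x + y):
\qquad
J_f(x, y) =
\begin{pmatrix} 2xy & x^2 \\ 1 & 1 \end{pmatrix},
\qquad
J_f(1, 2) =
\begin{pmatrix} 4 & 1 \\ 1 & 1 \end{pmatrix}.
\end{align*}
```

The matrix has m rows and n columns, one row for each output coordinate and one column for each input coordinate.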

“And there it is!” said Arthur triumphantly. “In multi-variable calculus, we can linearly approximate functions with a matrix of partial derivatives. But yesterday, we discovered that the tangent space of a manifold at a point is a vector space with a basis of partial derivatives. So we should be able to carry over the notion of the Jacobian to a map between manifolds!”

“By Gauss, you might be on to something. We need to explore this. Hold on.” Spherius’s whirring became distant briefly and then got louder again. “All right, I’ve set up an elevator between Flatland and another manifold. It’s perfectly smooth, so we should get some interesting results. You’re at the center of the neighborhood so you’ll have an interesting perspective.”

Arthur felt his insides stretch as he was carried to the new manifold. “That was… odd. I can’t describe it exactly, but I feel… wider?”

“That sounds right,” said Spherius wryly. “Your width has doubled, as has everything else near you. That was the linear map {(2,0),(0,1)}, so it doubled your width but left your length alone. You’ve become a rather fat man now!”

“Very funny, Spherius. Would you mind reversing it?”

“All right, be patient Arthur Rectangle.” He felt the transition reverse itself as he returned to Flatland. “So since that map was itself linear, its Jacobian was simply itself. It would be the same at every point. For more complicated functions, the motion will be less predictable. So we have to think about what happens to points far from the center of the neighborhood.”

“Perhaps, but I think we might be better served by discussing vectors rather than points. Any vector can be thought of both as a point in space and as motion through space, and I think the latter view is more profitable. We talked a lot about velocity vectors defined by curves, so let’s think about what happens to the velocity of something in motion as it is transformed by a differentiable map.”

“With the earlier map, their horizontal speed would double while the vertical speed would remain constant.”

“Let’s do an experiment.” Arthur set up five pebbles in a row and picked up a long stick. “You set up one of your elevators and I’m going to send all these pebbles rolling. When the center one hits the center of your neighborhood, activate the elevator.”

“All right, I’m creating a smooth map.” A light appeared a few meters away. “This map is non-linear, but at the lighted point its Jacobian is the matrix that rotates 120 degrees and scales by a factor of 1.5.”

Arthur tapped the pebbles and watched them roll. At the instant the center one hit the light, he saw them begin to twist. Arthur felt a lurch in his stomach. It was hard to judge the action from within, but at the other side the pebbles were all moving at different speeds and directions.

“Success!” cried Spherius. “The center pebble did, in fact, rotate 120 degrees and scale by a factor of 1.5, but the others took very different routes.”

“But the ones nearest the center came close, while the edge ones were much different. Exactly as predicted.” Arthur paced a bit as he thought about the implications. “So we are essentially taking a map of points and inducing a map between velocity vectors. In other words, this is a map between the tangent spaces of our manifolds! Given a map **phi** between manifolds M and N, we are describing a _differential_ map from TpM->T **phi** (p)N.”

“To be consistent, we should define this map in terms of how we earlier defined velocity vectors. So we have a curve **alpha** in M, where **alpha** (0)= p and **alpha** ’(0)=v. Then the image of **alpha** will have to be some curve in N, so we have to define a new map of an interval into N.”

“Let’s not overcomplicate things, Spherius. Since **phi** is a smooth map, we should have no trouble defining this new curve as **beta** = **phi** **o alpha**. Then the differential map at p could simply be defined as d **phi** p(v) = **beta** ’(0). Since the vectors are being transformed linearly, the differential will be the linear combination of the partial derivatives, i.e., the Jacobian. Except… except for the fact that we should have a unique differential map, and we’ve defined it relative to an arbitrary curve.”

“I don’t think that will matter. After all, the choice of curve was only a tool to give us a vector. If the map is properly linear, then every vector through p should be transformed the same way.”

“Ah, I think you’re right. If we evaluate the action of a linear map on a single vector through p, then the same action should apply to every vector through p. Let’s consider this in local coordinates and see if we can get something more concrete.” Arthur began sketching on a slate. “So we need to define our neighborhoods and their charts by **x** : U->M1^n and **y** : V->N2^m. Then we can express **phi** locally as **y^-1 o phi o x**, which would map R^n to R^m. **y^-1 o phi o x** (q) = (y1(x1,…, xn),…, ym(x1,…, xn)).”

“Wait a minute,” said Spherius, “are the x’s and y’s points or functions?”

“A little of both, I suppose. (x1,…, xn)=q is a point in U, so it’s near p. The y’s would be the output of the **y^-1** function, which are each real numbers, but they are written as functions to emphasize that their values depend on all of the x’s. I suppose the most accurate answer is that each yi is a coordinate function of **y** which depends on all coordinates of q, and together they are a point in R^m near **phi** (p).”

“Okay, I think I follow. Then **alpha** is a parametrization that gives points in the manifold M1, so we could compose **x^-1 o alpha** = (x1(t),…, xn(t)) to give a function from R to R^n.”

“And then **beta = phi o alpha** , so we combine all our work to yield **y^-1 o phi o x o x^-1 o alpha** = **y^-1 o beta** = (y1(x1(t),…, xn(t)),…, ym(x1(t),…, xn(t))), which is the desired map from a segment of R to R^m.”

“And from there, since our function has one-dimensional real inputs, we can use the chain rule to find the derivative in each coordinate. Each yi gives a row of partial derivatives, and the i-th coordinate of **beta** ’(0) is the sum over j of (dyi/dxj)(xj’(0)). So the ij entry of the differential matrix is just dyi/dxj, and the matrix acts on the vector (x1’(0),…, xn’(0)). And, Arthur, I believe this addresses your concern about uniqueness, since our final definition depends on **alpha** only through the vector **alpha** ’(0).”
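
Written out in full, the computation the two friends have just described looks roughly like this. It uses only the symbols from the dialogue (the coordinate functions y_i of the local expression of phi, and the coordinates x_j(t) of the pulled-back curve); the partial derivatives are understood to be evaluated at the coordinates of p.

```latex
\begin{align*}
(\mathbf{y}^{-1} \circ \beta)(t)
  &= \bigl(y_1(x_1(t),\dots,x_n(t)),\ \dots,\ y_m(x_1(t),\dots,x_n(t))\bigr), \\
\beta'(0)
  &= \Bigl(\sum_{j=1}^{n} \frac{\partial y_1}{\partial x_j}\,x_j'(0),\ \dots,\
           \sum_{j=1}^{n} \frac{\partial y_m}{\partial x_j}\,x_j'(0)\Bigr)
   = \begin{pmatrix}
       \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_n} \\
       \vdots & \ddots & \vdots \\
       \dfrac{\partial y_m}{\partial x_1} & \cdots & \dfrac{\partial y_m}{\partial x_n}
     \end{pmatrix}
     \begin{pmatrix} x_1'(0) \\ \vdots \\ x_n'(0) \end{pmatrix}.
\end{align*}
```

The curve enters only through the column of numbers x_j'(0), that is, through the vector v itself, so two curves with the same velocity at p are sent to the same vector.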

“Well, how about that? It does depend on our choice of local coordinates, but thanks to the smoothness of the transition functions it would still be linear in any system. This is really excellent. We’ve fully generalized the idea of linear approximation (the key to all of calculus) for general maps between manifolds.”

“Here’s a question,” asked Spherius. “Suppose the differential was… well, let’s say perfect.”

“What do you mean by perfect?”

“I mean that it preserves all data, that it sends all vectors to all other vectors and the original vectors can be recovered.”

“Ah, you mean the vector spaces are isomorphic. They’re bijective and preserve all relevant operations, namely vector addition and scalar multiplication.”

“Right, then if that’s the case, what can this tell us about our original function?”

“Good question. For starters, only square matrices can be isomorphisms, so the manifolds must be the same dimension. The matrix would have to be non-singular, so the inverse exists. But then… ahh, I see. We can apply the Inverse Function Theorem.”

“I am vaguely familiar with that concept,” said Spherius, “but not its full meaning.”

“The holistic idea is that if you have a complicated function that you approximate linearly, then if the approximation is solvable the original is as well, but perhaps only locally. For example, if, under the differential, a vector were sent to its rotation, we know there is an inverse matrix that could recover the original vector. Then the map **phi** itself might be too complicated to solve explicitly, but we would know that an inverse exists and that each resultant vector in a certain neighborhood has a unique preimage. But if that’s the case, then that means the inverse map must be differentiable as well, so locally the map **phi** is a diffeomorphism!”

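“Say that one more time, Arthur?”

“If the differential of **phi** at p is an isomorphism, then **phi** itself is a diffeomorphism on some neighborhood of p.”

“I see. And what if the differential is not an isomorphism?”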

“I have a feeling that will open up a whole new world of possibilities.”
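
For reference, the theorem Arthur invoked in this chapter can be stated for manifolds as below, followed by a one-variable illustration. The example phi(x) = x^2 is the narrator's own choice and does not appear in the conversation.

```latex
\begin{align*}
&\textbf{Inverse Function Theorem.}\ \text{Let } \varphi : M \to N \text{ be differentiable and let } p \in M. \\
&\text{If } d\varphi_p : T_pM \to T_{\varphi(p)}N \text{ is an isomorphism, then there are neighborhoods }
  U \ni p,\ V \ni \varphi(p) \\
&\text{such that } \varphi|_U : U \to V \text{ is a diffeomorphism, and }
  d(\varphi^{-1})_{\varphi(p)} = (d\varphi_p)^{-1}. \\[1ex]
&\text{Example: } \varphi(x) = x^2 \text{ on } \mathbb{R}.\quad
  \varphi'(1) = 2 \neq 0, \text{ and near } x = 1 \text{ the inverse is } y \mapsto \sqrt{y},
  \text{ with derivative } \tfrac{1}{2} \text{ at } y = 1. \\
&\text{But } \varphi'(0) = 0, \text{ and no neighborhood of } 0 \text{ is mapped one-to-one, since }
  \varphi(-x) = \varphi(x).
\end{align*}
```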


	5. Concerning Immersions and Relative Motion

“Earlier, you said Flatland was ‘embedded’ in Spaceland,” said Arthur. “What exactly did you mean by that?”

“I mean that… well, I suppose I simply mean that it’s contained in it, but it has its own independent structure.”

“Interesting, interesting. If one manifold is contained within another as a subset, then there is an inclusion map from the smaller to the larger, which as vector spaces could be seen as simply copying each vector into the new space.”

“Why are we discussing vector spaces?” asked Spherius. “We have no idea what kind of maps could be drawn between our manifolds, so we can’t assume they’re linear.”

“That’s true, but we do know that the differential map is linear. So perhaps we can start by considering its properties and then transferring them to the original function.”

“All right, you said we should copy vectors into a new space. You’re saying this ‘inclusion map’ would be the identity matrix?”

“Not exactly. Flatland is only two dimensions while Spaceland is three. If we wanted to map Flatland into the same position from the perspective of Spaceland, we would need a 3x2 matrix, which would be the 2x2 identity matrix with an extra row of zeros. This way the first two coordinates would be the same, but the third would be zero.”

“The third would not be zero,” objected Spherius. “I told you at the beginning that Flatland is not flat, and in fact its height changes as we move across it.”

“True, true,” said Arthur. “And yet, we may be able to treat it as such despite that setback. Any neighborhood in Flatland is locally equivalent to a subset of R^2, after all. We could just as easily say that it is equivalent to a circle in R^3. While that may be at some odd angle from your perspective, it would be trivial to rotate and translate it so that it lies in the xy-plane. So I would say that, if I can identify a circle with my neighborhood, you can easily maintain my xy-coordinates and simply declare the z-coordinate to be 0 for each such point.”

“I suppose that is acceptable. I am more used to judging things against an objective frame of reference, but more and more I am realizing there is no such thing.”

“Right, and moreover this simplification makes the calculations much easier. Each vector would be carried to itself with a zero appended to the end. Then the inclusion map is i:M->N by {(1,0),(0,1),(0,0)}. I would go so far as to say this map makes Flatland a _submanifold_ of Spaceland.”

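To see the inclusion matrix at work, here is its action on a general Flatland vector. The components v_1 and v_2 are placeholder names introduced by the narrator for illustration.

```latex
\begin{align*}
\begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}
\begin{pmatrix} v_1 \\ v_2 \end{pmatrix}
=
\begin{pmatrix} v_1 \\ v_2 \\ 0 \end{pmatrix}.
\end{align*}
```

The two columns of the matrix are linearly independent, so the only vector sent to zero is the zero vector: distinct Flatland vectors remain distinct in Spaceland.
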
“If you’re going to start making such definitions, then judging by your past definitions you’ll need to be sure they preserve all the important properties. Continuity, differentiability…”

“Good thinking, Spherius. Continuity will be taken care of by using the induced topology. We simply define the preimage of open sets as open, and voila, the map is continuous. But I think we also need the inverse map to be continuous, so we’ll say **phi** is a homeomorphism from M to N. It’s continuous in both directions. We can then say that **phi** embeds M into N provided **phi** (M) is a subset of N and it’s a homeomorphism between M and **phi** (M).”

“That’s good as far as it goes, Arthur, but I’m concerned that it’s an incomplete definition. You see, as you have defined it, you could ‘embed’ a three-dimensional space inside a two-dimensional one, which strikes me as absurd. I think we must also insist that the dimension of the codomain is at least that of the domain.”

“I see your point. Let’s do an experiment on one of these. Here, I’ve got two pebbles. I’ll give them a smack to get them both moving in different directions, and then you lift us to a new space. There should be no problem embedding a two-space into another two-space.” Arthur hit the pebbles and watched them move. They spread out from Arthur at a forty-five degree angle, and as they were lifted the pebbles started to turn towards one another. In the end, the first was moving much faster while the second was slowly rolling towards its path.

Arthur and Spherius repeated the experiment three more times and got similar results, but the fourth was different. The pebbles began rolling at a forty-five degree angle, but when the elevator stopped they were moving exactly parallel. “Wait a minute, wait a minute. What map was that?”

“Er, it’s a bit complicated. The differential at the center was {(1,2),(1,2)}.”

“But that’s the same row twice.”

“Well, yes, I suppose it is. What’s the significance of that?”

“Think about it, Spherius. We had two pebbles moving skew to each other, and they ended up parallel.” (Figure 5.1)

“So?”

“So relative motion is destroyed! In the earlier ones, the relative motion of the points may have altered, but it still existed. Now there’s not even a trace.”

“I’m still having trouble seeing what’s so important about that.”

“What’s important is that these are differentiable manifolds, meaning they embody change. Certainly these changes should be preserved if we want to say the manifolds are in any way identical. But if they no longer move at angles to one another, that means they no longer span the space. In essence, these two pebbles are not even really in the same two-space, they are in parallel one-spaces.”

Spherius was quiet as he thought it over. “What you’re talking about is relative frame of reference. Our scientists have discussed the idea that motion is relative, meaning that it should be insignificant what frame of reference you consider. Two people standing on opposite sides of a car may disagree whether it’s moving left or right, while someone in the car may think the bystanders are the ones moving. But it would be unacceptable for them to think that they are all standing still. We need to ensure that our embeddings preserve this essential property.”

“Good, we’re on the same page now. But I think we should consider it in terms of the differential. After all, we spent a lot of time analyzing it, so it should be part of the discussion. Let’s revisit the differential of this map: {(1,2),(1,2)}.”

“I see one obvious property: the determinant is 0.”

“Absolutely. And that is the key to our problem. Because the determinant is 0, skew vectors are mapped to linearly dependent vectors, so we cannot recover their preimages.”
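
The arithmetic behind this exchange is short enough to check by hand. The two starting velocities below, (1, 0) and (1, 1), are an assumed pair at a forty-five degree angle; the story does not specify the pebbles' actual vectors.

```latex
\begin{align*}
A = \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix},
\qquad
\det A &= 1 \cdot 2 - 2 \cdot 1 = 0, \\
A \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix},
\qquad
A \begin{pmatrix} 1 \\ 1 \end{pmatrix} &= \begin{pmatrix} 3 \\ 3 \end{pmatrix}
 = 3 \begin{pmatrix} 1 \\ 1 \end{pmatrix}.
\end{align*}
```

Every output is a multiple of (1, 1), so two independent velocities come out parallel, and many different inputs share the same image.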

“You’re saying you want the differential to be invertible.”

“Precisely. I feel that if you took a busy street corner in Flatland and then moved it to a new space where everyone was stopped, we could not meaningfully say that we have preserved the character of that neighborhood. (Figure 5.2) Their relative motions should be the same, though perhaps rotated and scaled, as linear maps are wont to do. So since the differential is invertible, it must also be injective. And since we said earlier that an embedding is from one space to a space of equal or greater dimension, the differential will be an invertible matrix with possibly extra rows of zeros at the bottom.”

“I feel this can apply to more than just embeddings,” said Spherius. “I think the property that relative motion is preserved is important enough to warrant its own definition. It tells us that two independent vectors cannot be mapped to collinear vectors, or in full generality: if a set of vectors spans an n-dimensional vector space, then their images also span an n-dimensional vector space. So let’s say that an _immersion_ is a differentiable mapping whose differential is injective (or one-to-one), and if in addition the original mapping is a homeomorphism onto its image, then it’s an embedding. Then a submanifold is when the inclusion map itself is an embedding.”

“Impressive, Spherius. We may make a mathematician out of you yet.”

“I’m not sure whether that’s a compliment or a threat.”

“Wait- the differential is tied to a specific point, right? So what do you mean when you say ‘the differential’ of a map?”

“Ah, I see what you mean. I would say that if we want to think about frames of reference being relative, then we should have relative motion preserved at every point in the space. In other words, to be an immersion we should require that the differential at every point in our manifold is one-to-one. Of course, it would be nice if immersions were also embeddings so we could rely on topological properties, but that was probably too much to hope for.”

“Maybe not,” said Arthur. “Remember the Inverse Function Theorem? If we pick a single point, focus on only the submatrix that is invertible and ignore the extra rows, then it is an isomorphism. But we saw earlier that since the differential is an isomorphism, then the mapping is locally a diffeomorphism. But that means that it has all the properties of continuity that come along with differentiability, so it is in fact an embedding, at least locally.”

“So even if the mapping as a whole is discontinuous or non-invertible, it will at least have those properties on small neighborhoods about any point. And that follows directly from our definition of immersion; that’s really something. One other thing bothers me, though: we’ve been talking about tangent spaces at a point. What if we do want to consider the entire manifold’s tangent space? That is to say, we want to be able to think about all the possible points and the possible velocity vectors at those points?”

“That shouldn’t be too difficult. We can just take the direct product of the manifold and its tangent spaces. Let’s say TM = {(p,v) : p is in M, v is in TpM}. So we bundle all of these tangent spaces together in a new structure.”

“A new structure… do you think this _tangent bundle_ is itself a manifold?”

“I suspect it is. Let’s say we had a neighborhood (Ui, **x** i). Then we can map the pair of a point and a vector based at that point into this new structure as **y** i : Ui x R^n -> TM by mapping the coordinates of the point in M according to **x** i and then use the coordinates of the vector **v** as the scalar multiples of the partial derivatives. In essence, the tangent spaces are transformed into a new basis, but the scalar multiples of each entry remain the same. It will probably take me a little work to prove it, but I’m pretty sure that will turn out to be a differentiable structure in its own right.”
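
Arthur's proposed charts can be sketched in symbols as follows. This follows the standard textbook construction rather than anything proved in the conversation; the names u_1, ..., u_n for the vector's coordinates and the second chart index j are assumptions added for illustration.

```latex
\begin{align*}
\mathbf{y}_i &: U_i \times \mathbb{R}^n \to TM, \\
\mathbf{y}_i(x_1,\dots,x_n,\,u_1,\dots,u_n)
  &= \Bigl(\mathbf{x}_i(x_1,\dots,x_n),\ \sum_{k=1}^{n} u_k \frac{\partial}{\partial x_k}\Bigr), \\
(\mathbf{y}_j^{-1} \circ \mathbf{y}_i)(q, u)
  &= \bigl((\mathbf{x}_j^{-1} \circ \mathbf{x}_i)(q),\
           d(\mathbf{x}_j^{-1} \circ \mathbf{x}_i)_q(u)\bigr).
\end{align*}
```

Since each transition map between charts of M is differentiable, and so is its differential, the transition maps of TM are differentiable as well, which is the heart of the argument that TM is a manifold of dimension 2n.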


	6. How Vector Fields Interact with Brackets

“What happens when a force from outside acts on Flatland?”

Spherius spun on his axis as he thought about what Arthur had said. “Well, there's always some force acting on Flatland. Gravity, for instance, which causes the rain to fall southward.”

“Yes, but you can do more. What if you were to push all of us here in Flatland?”

“And why would I want to do that?”

“You said you want to explore the interaction between different spaces. How else can they interact except through pushing and pulling?”

“You have a point there. Let's see... I suppose pushing you through Flatland would cause you to move. Let's try.” Spherius reached down and shoved Arthur. He watched as the square coasted along the curves that only he could see. As Arthur crested a hill, Spherius would have expected him to fly off in a straight line, were it not for the mysterious adhesive force that held its inhabitants on the surface. A straight line... a straight line... “Eureka! When I push you, your motion is determined by the instantaneous velocity as I release. But we already know that that velocity is represented by a vector in the tangent space. So a force from the outside will assign to each point a velocity vector in the tangent space to that point.”

“In that case, we should say that the force can be represented as a _vector field X_ on M. When writing, it will help clarify if we use capital letters to represent vector fields.”

“So X assigns to each point p in M a vector X(p) in TpM. In other words, X is a mapping from M to TpM.”

“Close, Spherius, but it won't quite do. Remember that the various tangent planes may be isomorphic to one another, but they are distinct spaces. So to be precise, X maps M to TM, but the first coordinates are simply a copy of the coordinates of p. X(p)=(p,v).”

“Very well, since you mathematicians like to be picky.”

“Hey, I'm not the one who'd have to deal with skewed planes intersecting.”

“The first half of this map is obviously trivial, but let's focus on the second half. What would be the form of this mapping? I suppose it's another vector space function.”

“That seems plausible. Let me draw some coordinates.” Arthur sketched a few marks in the ground to form a crude axis, and drew a dashed circle around the outside. “So now I have a system of local coordinates for this neighborhood U of Flatland. We can then pick any point- say, this point, (1,-2). So your push can be expressed here as its horizontal and vertical motion, and since the directions are orthogonal, we can compute them independently.”

“Correct, we could say the horizontal motion is ax((1,-2)) and the vertical is ay((1,-2)). Then, using our earlier conceit of partial derivatives as a basis for the vector space, we can say

X(p) = ax((1,-2))(d/dx) + ay((1,-2))(d/dy).”

“Technically you left off the 'p' coordinate at the front, but I think we can let that slide as long as it's clear that it's always implicit. More generally,” said Arthur, “in an n-dimensional space, X(p) = SUM (ai(p))(d/dxi) for all n summands. Then each ai is a real-valued function on U, and they form the scalar factors of a linear combination.”

“Would it be fair to say that, since the mapping is the sum of functions on U, it is differentiable when its component functions are differentiable?”

“I can't see why not. Differentiation is a linear operator, after all, so we should be able to differentiate term by term. An operator...”

“Is it your turn for a 'eureka' moment, Arthur?”

“I was just thinking that X(p) is a combination of real-valued functions and partial derivatives, but they are sort of 'free-floating.' You can imagine a bunch of engines revving, just waiting to be given a function to work on. So maybe we can think of vector fields as operators that take functions on M to other functions on M.”

“Interesting idea. I suppose for that to work, the input function would have to be differentiable, but there's no telling what the output functions would be. It's similar to the standard derivative in calculus. The function f(x)=x^2 is a function from R to R, and the derivative changes it to f'(x)=2x, which is still a function from R to R. Likewise, if we feed a function on M to a vector field, we should get a new function on M.”

“So X:D->F, where D is the set of differentiable functions on M and F is the set of general functions on M.”

“Then if that were the case, Xf(p) would be the linear combination of partial derivatives we expressed before, but applied to f one by one.”

“Ah-ah, Spherius, remember f is a function on M, while partial derivatives are defined on real numbers.”

“Must all mathematicians be such pedants, Arthur Square? You know perfectly well that I am speaking of a function that has been parametrized to your neighborhood U.”

“All right, Spherius, I suppose we can 'abuse the notation' just this once. With that in mind, Xf(p) = Σ (ai(p))(df/dxi)(p), and each summand is one coordinate of the tangent vector that X generates at p. Essentially, this is just another version of the directional derivative. Could we perhaps take a 'second derivative' in this fashion? Say, X(Xf), or X(Yf)?”
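
A concrete case may help here. The field and the function below are the narrator's illustrative choices (the story never specifies Spherius's push); only the point (1, -2) is borrowed from Arthur's sketch. Take X = y d/dx - x d/dy on the plane and g(x, y) = xy.

```latex
\begin{align*}
X &= y\,\frac{\partial}{\partial x} - x\,\frac{\partial}{\partial y},
\qquad
X(1,-2) = -2\,\frac{\partial}{\partial x} - \frac{\partial}{\partial y}, \\
Xg &= y\,\frac{\partial (xy)}{\partial x} - x\,\frac{\partial (xy)}{\partial y}
    = y^2 - x^2,
\qquad
Xg(1,-2) = 4 - 1 = 3.
\end{align*}
```

The output Xg = y^2 - x^2 is again a function on the plane, as the operator view of a vector field suggests it should be.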

“I don't think that will work, at least not always. Keep in mind that we have no guarantee that our outputs are even differentiable, so it may not be possible to take a second derivative.”

“That's true, but there must be some way to compose vector fields.”

“Let's write out the definitions in general. If you please, Arthur?” Arthur wrote in his notation X = Σ ai (d/dxi) and Y = Σ bj (d/dxj). “Now what would happen if we tried to apply X and Y consecutively?”

“We can't necessarily do that.”

“I know, but humor me for the moment, Arthur. Let's write out the formal sum applied to a differentiable function f.” Arthur wrote out XY(f) = X(Σj bj (d/dxj))(f) = X(Σj bj (df/dxj)).

“The X can certainly move inside the sum, by linearity.” Σj X(bj df/dxj). “And now it's a simple product rule application.” Σj [X(bj) df/dxj + bj X(df/dxj)] = Σj [Σi ai (dbj/dxi) df/dxj + bj Σi ai (d/dxi)(df/dxj)] = Σj Σi ai (dbj/dxi) df/dxj + Σj Σi ai bj (d/dxi)(df/dxj). “Well, that's a mess and a half,” sighed Arthur. “And the last term is a second derivative, so we're still lost.”

“No, we're home free!” cried Spherius.

“How in Pythagoras's name do you think that helps us?”

“Well don't you see, Arthur? The first set of terms are the product of the first derivatives of two differentiable functions, so no harm there. And for the second, consider that our definitions of X and Y were totally arbitrary. So if we reverse the order of X and Y, it will have the effect of creating a different set of first terms but the exact same second set of terms! And then we can simply subtract them and completely eliminate all second order derivatives!”

Arthur was silent for a moment as he thought over Spherius's suggestion. It seemed too simple to be trusted, and yet the offending terms did in fact subtract away to zero. If strict composition of fields wouldn't give a new field, perhaps this odd operation was the way to go. “What would you call this new field, Spherius?”

“I was thinking of calling it the _bracket_. We could denote it by [X,Y](f) = XY(f) – YX(f). Then once we clean up the algebra, it will take the form

[X,Y](f) = Σij (ai dbj/dxi – bi daj/dxi) df/dxj. So, in words, you take the coefficients of X times the partial derivatives of Y minus the coefficients of Y times the partial derivatives of X, and then multiply each of those expressions by the partial derivatives of f. Always keeping in mind that the 'coefficients' are themselves real-valued functions.”
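
Here is one small bracket computed both ways, as a check on the formula. The two fields are illustrative choices by the narrator and are not among the examples Arthur and Spherius are said to have tried: X = d/dx and Y = x d/dy on the plane, with f assumed twice continuously differentiable so that the mixed partials agree.

```latex
\begin{align*}
XY(f) &= \frac{\partial}{\partial x}\Bigl(x\,\frac{\partial f}{\partial y}\Bigr)
       = \frac{\partial f}{\partial y} + x\,\frac{\partial^2 f}{\partial x\,\partial y},
\qquad
YX(f) = x\,\frac{\partial}{\partial y}\Bigl(\frac{\partial f}{\partial x}\Bigr)
       = x\,\frac{\partial^2 f}{\partial y\,\partial x}, \\
[X,Y](f) &= XY(f) - YX(f) = \frac{\partial f}{\partial y}. \\[1ex]
\text{By the coefficient formula, with } & a = (1, 0),\ b = (0, x):
\quad
\text{the } \tfrac{\partial}{\partial y} \text{ coefficient is }
a_1 \frac{\partial b_2}{\partial x} = 1 \cdot 1 = 1,
\text{ and all other terms vanish.}
\end{align*}
```

The second-order terms cancel exactly as Spherius predicted, leaving the first-order field d/dy.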

“I must admit, there is some nice symmetry to that expression,” said Arthur. “There's some reminiscence to the chain rule, with the numerator of the inner partials matching the denominator of the outer partials. And it has some useful properties, like anti-commutativity and linearity.”

“Anti-commutativity?”

“I mean to say that if you reverse the order of X and Y, your signs will reverse. The expression has the coefficients of X minus those of Y, so reversing the order of the subtraction will reverse the sign of the difference. And for linearity, [aX+bY,Z] = a[X,Z] + b[Y,Z].”

“Ah, yes. That should work as long as a and b are real numbers and not functions. I think it might be worth doing some calculations with these brackets to learn more about how they behave.”

We will spare the reader the tedium of the calculations Spherius and Arthur undertook, as many of them proved fruitless. However, it is worth recording two of the results that can be easily verified. First is what is known in our world as the _Jacobi Identity_ : [[X,Y],Z] + [[Y,Z],X] + [[Z,X],Y] = 0. Second, following up on Spherius's comments about linearity, they investigated the bracket of two fields with functions as coefficients. A few applications of the product rule yielded the identity

[fX,gY] = fg[X,Y] + fX(g)Y – gY(f)X.


	7. In Which Arthur Learns to Go with the Flow

Spherius watched Flatland from above and saw the silly polygons at play.

No, he told himself, that was patronizing again. He shouldn’t assume that two-dimensional beings were less intelligent than he. After all, Arthur had proven himself more than Spherius’s equal. Who knew what potential they might have?

Still, from above it was easy to think he knew all their inner lives. He could literally see inside them, after all. He was currently watching several sitting alongside a river. One of the isosceles got too close and was swept away in the current. It was not moving so fast that he was in danger, and he simply laughed as the flow pushed him along its preset course. At every point the current…

“At every point, the current is pushing him in a preset direction,” Spherius said slowly. “If every point has a preset direction, then the river is the only path he could ever take. But the river… the river is a vector field!”

Spherius shot across space, dipping under the branches of Flatland until he found Arthur Square at home with his family. “Arthur!”

The polygons jerked side to side at the noise. “Spherius, what is it? I’m having dinner with my family- have you met my daughter Sally?” He gestured at the orange pentagon.

“Yes, yes, pleased to meet you. I have a problem I wanted to work on with you.”

“Can it wait?”

“I’d really prefer to get it down on paper while it’s fresh on my mind.”

“Oh, very well. Sally, you and your mother keep eating, I shouldn’t be long.” Arthur departed his living room and went to the study at the northeast corner. “So what’s this idea you had?”

“I was looking at a river flow and I got to thinking: Doesn’t a vector field also cause things to flow?”

“Come again?”

“What I mean to say is this: Let’s say we have a vector field on a manifold. So each point is assigned a tangent vector, right? But that vector tells a particle what direction to move. So let’s say you drop a particle at an arbitrary spot on the manifold. Then the vector field pushes it in one direction, but as it moves it is pushed in a new direction. This means that the direction is continuously changing but it is completely determined by the vector field itself! So any point follows a unique path dependent solely on the vector field!”

“So it would be deterministic,” said Arthur. “Chaos says that in a realistic situation we wouldn’t be able to predict the motion fully because we wouldn’t have complete information. Something as simple as a slight breeze could utterly thwart our calculations. However, if the only motion is caused by the force of the vector field, then we would have complete information and could fully predict it.”

“Precisely! Ah, I’m glad you agree. For a minute I was afraid I was jumping to conclusions.”

“No, I think you’re on to something. You’re saying that you want to determine the path of a particle from its vector field. Actually, I think this is more familiar than you think.”

“What do you mean?”

“Generalize it fully, Spherius. The vector field gives you the tangent vector at any point- essentially, it’s a form of a derivative. You know its starting point and you want to find its position as a function of time. So what you’re saying is that you know the derivative and the initial condition of a function, and you want to find the function itself.”

Spherius spun as he thought. “Then- it’s a differential equation!”

“Bravo! And we know that while such equations cannot always be solved globally, if the function is continuous on a neighborhood of the point then there is at least a local solution. And if we have the slightly stronger requirement that it is Lipschitz continuous- that is to say, the slopes of the secant lines are bounded- then the solution is unique on that neighborhood.”

“So you’re saying we might not be able to follow the path indefinitely?”

“Perhaps not, but a local solution is better than nothing. Let’s focus on a point p and call the appropriate neighborhood of p U.”

“Don’t we normally use U for coordinate charts?”

“We do, but this time U is a subset of M itself, not R^n. Now what we want is a function whereby we input a point q near p and also a time t to represent how long the particle has been in motion. So then **phi** should map the direct product of time and starting position to a new position in M.”

“The time should start at 0,” said Spherius. “Except- wait, what would it mean to start? We might never know when the particle enters the neighborhood.”

“Yes, I think it might be better to say 0 is the time we start tracking the particle, so at time 0 it returns the same position as we input. Then we can allow t to take on both positive and negative values.”

“All right, so we allow an interval of possible times- say ( **-delta** , **delta** ). Then our mapping is

**phi:** ( **-delta** , **delta** ) x U -> M. Then if, for instance, I pick a point q and a time of 10, the function tells us where q ends up after ten seconds.”

“That should be right,” said Arthur, “provided 10 seconds is in the domain. So if we fix the point q, we get a function of a single real variable: t-> **phi** (t,q) that traces the path of q. Then the derivative of **phi** should be the vector field itself. Well, I should say it’s the action of the vector field on the function **phi**. d **phi** /dt= X( **phi** (t,q)) and **phi** (0,q) = q is the initial condition.”

“What if we approach this differently?” asked Spherius. “We can call one of these curves a _trajectory_. I contend that we could use the vector field to generate any number of trajectories by simply choosing a starting point q. If I were to drop a pebble at a specific point in the river, its path would be determined entirely by where it started.”

“I see. So the vector field is a function from points of U to the set of trajectories in M- each point returns an entire path, not just a point of it. If that’s the case, then t is no longer an input yet it’s still implicit in determining the trajectory. So let’s define **phi** t(q) as the curve generated by the initial condition q. Then **phi** t(q) = **phi** (t,q) and **phi** t : U-> M. You were referring to how things flow in the river, so let’s call this function the flow of X. Or, I suppose since it’s only defined on the neighborhood U, we’ll call it the _local flow of X_ in the neighborhood U of p.”
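A reader who wants to watch a local flow on a computer can try the following minimal Python sketch. The whirlpool field X(x, y) = (-y, x), the function name `flow`, and the use of the classical fourth-order Runge-Kutta scheme are assumptions of this sketch rather than anything the two geometers used; the scheme only approximates the true solution of d **phi** /dt = X( **phi** ) with **phi** (0,q) = q.

```python
import math

def X(p):
    """Sample vector field on the plane: X(x, y) = (-y, x), a steady whirlpool."""
    x, y = p
    return (-y, x)

def flow(t, q, steps=1000):
    """Approximate phi(t, q) by RK4 integration of d(phi)/dt = X(phi), phi(0, q) = q."""
    h = t / steps
    x, y = q
    for _ in range(steps):
        k1 = X((x, y))
        k2 = X((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]))
        k3 = X((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]))
        k4 = X((x + h * k3[0], y + h * k3[1]))
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (x, y)

q = (1.0, 0.0)
print(flow(0.0, q))          # time 0 returns the starting point
print(flow(math.pi / 2, q))  # close to (0, 1): a quarter turn around the origin
print(flow(-0.5, q))         # negative times run the river backwards
```

For this field the exact trajectory through (1, 0) is (cos t, sin t), so the sketch can be checked by hand. Following the flow for 0.4 seconds and then 0.3 more should also land where a single 0.7-second ride does, which is the sense in which the starting point alone determines the path.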

“Why don't you go for a ride, Arthur?”

“Excuse me?”

“The river. I just thought of something. You see, I'm curious how the temperature changes as you follow the flow. Do you have a thermometer?”

“Of course.”

“Then I'd be interested in seeing what kind of readings you get. I'm going to call the temperature function f, and I'm going to push on Flatland as you ride down the river. You see, we could have two different vector fields acting on the same manifold, and I'm not sure how they would affect one another.”

“So you might push perpendicular to the flow of the river?”

“Something like that. But while our vector fields can be applied to points, it may be more profitable to apply them to an entire curve, such as **phi** t, the flow of the river.”

“Hold on, Spherius, you're losing me. What does temperature have to do with this?”

“Temperature is a single real number, so your thermometer could be seen as a function from Flatland to the real numbers. Then if we apply two apparently different vector fields to every such function and always get the same result, we can say they are the same field.”

“All right, I'll give it a shot. I wasn't looking forward to getting wet today, but fine.”

“Excellent. What I want you to do is measure the temperature, then mark the new temperature as frequently as you can.”

“I can set this thermometer to automatically record the temperature every tenth of a second.”

“All right, let's get to it.”

Arthur took a deep breath, activated the thermometer, and jumped in. To him, the water was quite cold, but he reassured himself that more precise measurements were going to be helpful. As he rode the current east, he felt a gentle push in a more northerly direction. Spherius, no doubt. After about twenty seconds he was back on the shore. He shivered a bit until a towel appeared in front of him, which he pulled along his perimeter.

“Now, let's look at that data,” said Spherius. “I suppose we could leave the temperatures as they are, but since we're more interested in change, let's subtract off the initial temperature so we start at zero. Then we can define the change in temperature as h(t,q) = f( **phi** t(q)) – f(q), so the first term is the temperature after t seconds in the water and f(q) is your initial temperature.”

“I want to rewrite this,” said Arthur. “Since we're talking about the changes in the output of a function, we should be able to divide by the change in input, i.e. time, to give us the average rate of change. So let's say h(t,q)/t = [f( **phi** t(q)) – f(q)]/t, and call this new function g(t,q). Then g is the average change in temperature per second over the first t seconds, and eventually we should be able to shrink t to zero in order to find a derivative.”

“Are you sure that g is differentiable at all?”

“It is if we define it properly. A simple change in variables is all it takes to verify. Then taking the limit as t goes to zero will give us the derivative of h.”

“Let me check that calculation,” said Spherius. “Yes… yes, a quick application of the product rule shows us that since h(t,q) = tg(t,q), then the derivative dh/dt at zero is simply g(0,q). Let’s reiterate, because we have a lot of variables floating around. So f is a function on U that gives the temperature at any point, while h and g are two-input functions, where h gives the total change in temperature of a path starting at q and ending t seconds later, and g gives the average change in temperature over the same time period.”

“We can write f o **phi** t(q) = f(q) + tg(t,q). And since g(0,q) is the instantaneous rate of change in the temperature at time 0, it also represents the action of the vector field on the temperature function. So g(0,q) = Xf(q), which should be true at any point in the manifold. Now we need to bring in the second vector field Y, the one that represents your push, and I’m afraid it’s going to get messy.”

“Can we apply a vector field to this function?” asked Spherius. “Its range is in the real numbers, after all, not points in the manifold.”

“Good point. Let’s think some more about the equation we just saw: f o **phi** t(q) = f(q) + tg(t,q).” The two stared at it for a moment. “Wait, I think I see something.”

“What?”

“The left side of the equation, it represents the final temperature, right? And f(q) is the initial temperature? Well, since g is the average change in temperature multiplied by time, this is just another type of linear equation, with g representing the slope!”

“Ah, a good old-fashioned y=mx + b problem. Everyone loves those. Or, rather, g is a function, but if we let t approach zero then g becomes a linear map.”

“Right, now what I want to do is apply Y, the push. f o **phi** t(p) is a function that takes in a point and gives the temperature at that time. We can apply the differential map d **phi** tY to it, and that gives us a linear approximation of how the temperature is changing in the Y direction.”

“I think I see it,” said Spherius. “There is a vector saying how fast the temperature is changing along the flow **phi** t at p, but the differential map will take that vector and give a new one that tells us how fast the temperature is changing in this new direction. And that new direction won't be exactly the one determined by Y, but something that represents their combined action. Another way of writing that would be Y(f o **phi** t)(p), since it takes p and a function through p, and applies a vector to yield a new vector.”

“Yes, and that’s perfect! We can go back up to our previous equation, since its left side is exactly what you just said!”

“Ah-ha! So we have ((d **phi** tY)f)( **phi** t(p)) = Y(f o **phi** t)(p) = Yf(p) + tY(g(t,p)). Now take Yf evaluated a little way downstream, (Yf)( **phi** t(p)), and subtract each side of that equation from it. We’re left with (Yf)( **phi** t(p)) - ((d **phi** tY)f)( **phi** t(p)) = (Yf)( **phi** t(p)) - Yf(p) - tY(g(t,p)).”

“Oh, we're so close I can taste it,” said Arthur. “No, no, calm down. I've found getting excited about finishing a problem can distract one from the process of solving it. We have a multiple of t on the right, but the whole reason we created the g function was so that we could find a derivative. So let's factor the f( **phi** t(p)) out of the left side. Well, technically 'factor' is the wrong word, but we'll take advantage of the fact that Y is a linear operator. Then divide out t from the equation:

1/t [Y - d **phi** tY]f( **phi** t(p)) = [(Yf)( **phi** t(p)) – Yf(p)]/t – Y(g(t,p)).”

“That left side is precisely the average rate of change of Y combined with **phi** t, so we can take the limit to get a derivative:

lim t->0 1/t [Y - d **phi** tY]f( **phi** t(p)) = lim t->0 ([(Yf)( **phi** t(p)) – Yf(p)]/t – Y(g(t,p))).”

“What's next?” asked Spherius. “Let's see, on the right hand side we have a fraction which is Yf evaluated at very close points along **phi** t , then divided by the change in time. So that's a derivative of Yf along the flow of X, which means we can rewrite the entire fraction as X(Yf)(p), or XY(f(p)).”

“And the right term, we already saw can be expressed as Y(Xf(p)), or YX(f(p)). So...” The two were suddenly silent as the same thought occurred to them.

“But that's the bracket!” shouted Spherius. “The entire right side reduces to XY-YX of f at p, or ([X,Y]f)(p)!”

“So if we want the instantaneous change in temperature induced by two separate vector fields, we can simply compute the bracket at that point.”
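The limit can be watched numerically in a case simple enough to do by hand. The minimal Python sketch below makes several assumptions that are not in the story: the river is X = d/dx, whose flow slides every point t units east and therefore has the identity as its differential; the push is Y = x d/dy; and the temperature f is invented. For these fields the coordinate formula gives [X,Y] = d/dy.

```python
import math

def flow_X(t, p):
    """Flow of X = d/dx: slide every point t units to the east."""
    return (p[0] + t, p[1])

def Y(p):
    """Sample push Y = x d/dy, returned as a vector of components."""
    return (0.0, p[0])

def f(p):
    """A made-up smooth temperature function."""
    return math.sin(p[0]) * p[1] + p[1] ** 2

def directional(func, p, v, h=1e-6):
    """Central-difference derivative of func at p in the direction v."""
    ahead = (p[0] + h * v[0], p[1] + h * v[1])
    behind = (p[0] - h * v[0], p[1] - h * v[1])
    return (func(ahead) - func(behind)) / (2 * h)

def difference_quotient(t, p):
    """(1/t) [Y - d(phi_t) Y] f, evaluated at phi_t(p)."""
    q = flow_X(t, p)
    pushed = Y(p)  # a translation has identity differential, so the components carry over
    return (directional(f, q, Y(q)) - directional(f, q, pushed)) / t

def bracket_f(p):
    """[X,Y] f for these sample fields, where [X,Y] = d/dy."""
    return directional(f, p, (0.0, 1.0))

p = (0.4, -0.9)
for t in (1e-1, 1e-2, 1e-3):
    print(t, difference_quotient(t, p), bracket_f(p))
```

As t shrinks, the middle column should creep toward the last one. A general flow would need its differential computed as well, which this sketch avoids by choosing a translation.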

“This will greatly simplify our future calculations! I had been worried about the difficulty of computing these combined flows, but now I know we have a simple formula!”

“One thing to be careful of,” warned Arthur, “is that the bracket is not commutative. In fact, it's anticommutative, meaning if we reverse the order then we get the opposite sign.”

“So you're saying that if we apply X and then Y and find an increase in temperature, then reversing the order will give us a decrease in temperature. That is certainly worth remembering. All right, it's been a long night, and I should let you get back to your family.”

“Actually, there is one other thing I needed to discuss with you.”

“Oh? What would that be?”

“The lights in Flatland. I wanted to talk about where they come from. Different spots have different brightnesses and there's no clear origin. What's more, if I turn on a light in Flatland, it spreads in all directions and gradually fades. However, many of these mystery lights illuminate a circle or other shape and then simply drop off at the boundary. I was wondering if this had anything to do with your 'above.'”

“It does indeed. Sometimes we have lights on to help us work, and some of what you see is starlight intersecting with Flatland. We could get a continuous path of light, but if we shine one orthogonal to Flatland then it passes right through it.”

“I see. Topologically, we call the area that is lit by your light the _support_ of it. That is to say, we could see the amount of light at any given point as a function whose support is all the points that receive a non-zero amount of light. And since the circles have a hard barrier, we can say the function is _compactly supported_. What I want to know is what would it take to light all of Flatland to the same level?”

Spherius made a loud choking noise. “Flatland is infinite, so we would need an infinite number of lights to do this thing.”

“That's true, but if we take a small neighborhood around any point, there would only be a finite number of lights that shine on that neighborhood, correct? In other words, the set of lights would be _locally finite_ , since we can always find a neighborhood where only finitely many of them are relevant.”

“I suppose so, though I don't see why that's relevant.”

“It's often useful to be able to reduce an infinite case to a finite one. So let's think over what exactly we're defining here, and make sure it's compatible with our ideas of differentiable manifolds. We want a bunch of real-valued functions to represent the light cast, and we want them all to go up to a maximum amount of light. Let's say 0 is no light and 1 is the maximum. We could never have a negative amount of light, obviously.”

“All right,” said Spherius, “then we can denote these functions as fn , where each one maps points in M to real numbers.”

“I'd prefer to use fα, since n is suggestive of countability. Also they should obviously be differentiable, and we'll have a differentiable structure {(Uα, **x** α)} for this manifold, so the coordinate neighborhoods **x** α(Uα) cover the entire manifold. Then to get this compactness condition, we'll insist that the support of each function is inside a coordinate neighborhood. In other words, I can encircle one of these light beams and choose coordinates with only finite values.”

“And you want them to combine to give the same value 1 at every point. That means the sum of all the functions evaluated at each point will be one. But there are an infinite number of them...”

“Ah, but that's why we insist that the set be locally finite. That way, at any individual point we are only summing a finite number of non-zero terms.”

“Brilliant! Maybe this has value after all. What do we call it?”

“Since we're essentially coming up with a way of adding a bunch of functions to make the number 1, I would call the set {fα} a _partition of unity subordinate to {Vα}_, where Vα is the coordinate neighborhood that contains the support of fα.”

“Do such partitions always exist?”

“That's an interesting question. But we didn't make any particularly odd assumptions, so I would guess that as long as our topological space isn't too strange we can find it.”

With more research, Arthur would eventually discover that a partition of unity exists on any differentiable manifold as long as each connected component is Hausdorff and has a countable basis. However, that is a story for another day.
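In the meantime, readers in our world can build a small partition of unity for themselves. The minimal Python sketch below works on Lineland rather than Flatland to keep the bookkeeping short, and everything in it is an assumption made for illustration: one lamp sits at each integer, the names `bump`, `lamp` and `f` are hypothetical, and each lamp's light is a standard smooth bump that vanishes outside an interval of radius 1.

```python
import math

def bump(x):
    """Smooth bump: positive on (-1, 1), identically zero elsewhere."""
    if abs(x) >= 1:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def lamp(n, x):
    """Unnormalized light from the lamp centred at the integer n; support is [n-1, n+1]."""
    return bump(x - n)

def f(n, x):
    """Normalized light: lamp n divided by the total light reaching x."""
    base = math.floor(x)
    # Only lamps within distance 1 of x can be lit there, so this sum is finite.
    total = sum(lamp(m, x) for m in range(base - 1, base + 3))
    return lamp(n, x) / total

for x in (-2.5, 0.0, 0.3, 7.99):
    print(x, sum(f(n, x) for n in range(-10, 11)))
```

Each printed sum should be 1 up to rounding, even though infinitely many lamps exist, because only the lamps near x contribute. That is local finiteness doing exactly the job Arthur wanted from it.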

**Notes for the Chapter:**

> Yes, Arthur has a pentagonal daughter. I frankly hate the idea of all "women" in Flatland being line segments who are naturally stupid and violent, so I'm ignoring that. I'm sure Abbott meant well, but this is my version of the story.


	8. On Metrics and Manifolds

Arthur began laying down the coordinates in the neighborhood he’d sectioned off for today’s experiments. The x-axis was marked 1, 2, 3, with negatives going in the opposite direction. He then labeled the y-axis at a right angle. At least, it appeared to be a right angle. The literal-minded Spherius had tried to insist on using a protractor, but Arthur had refused. The coordinates were the image of a map from R^2, so whatever system they set down was valid. There was no inherent scale, only what they chose to use. Besides, from Spherius’s point of view, the right angle might not be a right angle anyway. He saw curvature in space that Arthur couldn’t imagine. To him, a right angle might be 100 degrees, or a triangle’s interior angles might sum to 300 degrees. The subjectivity of their measurements was hard to accept, but that was exactly why they did these experiments. He marked the origin “p,” so p was the name of this spot in Flatland and (0,0) were its local coordinates. Arthur pictured a perfect circle in some Platonic ideal of a geometric plane, while his crude sketch was merely its image. A shadow cast from the world of the gods onto the world of humans.

“Are you ready, Arthur?”

“Huh? Oh, I’m sorry, Spherius, I was a bit lost in my thoughts. Yes, we can begin.”

“All right. If we’re going to perform mathematics on manifolds, we will need some way of measuring things. Since your coordinates are arbitrary (you could use meters or feet, for instance), there’s no inherent distance function on Flatland, but you can apply the standard Euclidean metric to any system of local coordinates. What’s more interesting is the tangent space.”

“I agree. We need some sort of norm for vectors, some way of measuring their length. I’m trying to decide what’s the best way to approach it…”

“Perhaps the dot product?” Arthur paused at Spherius’s words. “The dot product of two vectors in R^n gives the product of their lengths, multiplied by the cosine of the angle between them. Then you can find the length of a vector by dotting it against itself and taking the square root. Could that be generalized?”

“It’s an idea,” answered Arthur. “Let’s say our manifold was precisely R^n with the standard basis. Then any two basis vectors have the Kronecker-delta as their dot product, so we always get either 0 or 1. Since the dot product is bilinear, we can simply multiply by the respective lengths. We generalize these to the idea of an _inner product_ , which is symmetric, bilinear, and positive-definite. Well, technically it’s conjugate symmetric, but since we’re in the real domain that’s irrelevant.”

“Exactly what do you mean by ‘positive definite?’”

“Simply that the inner product of a vector v with itself is always positive for non-zero vectors, and zero if and only if v = 0.”

“So your dot product example was about finding the inner product of basis vectors.”

“Right, but that’s where I’m stuck,” said Arthur. “I have my two basis vectors, but they’re orthogonal, so I can’t see how I could generalize this any further.”

“But they aren’t- oh, of course. You can’t see it.”

“See what?”

“That your vectors are not truly orthogonal, at least not in Spaceland. Here, let’s pick a point in your neighborhood and I’ll let you enter the tangent plane.” The familiar glow started a few paces from p. “We’ll call that point q. Now throw one of these pebbles at it in each of the cardinal directions.”

Arthur walked around to the west of the point and threw a pebble, then to the south and threw one from that direction.

“Yes, of course,” murmured Spherius.

“What is it?”

“You see, to you it appeared you were throwing the pebbles in the directions (1,0) and (0,1). However, out here they went in the directions (2, 1, -1) and (1, 3, 1), so their dot product is 4.”

“I’m having trouble trying to visualize this, Spherius.”

“Here, I’ll give you an example. Draw a curve on a Cartesian plane.” Arthur looked down and sketched a curve in the dirt. “Ah, an exponential function. Good choice. Now, you agree with me that the curve is a one-dimensional manifold, correct?”

“Of course. Any segment on the curve could be written (t, e^t), so the segment is a function of t alone.”

“Right, so what is the basis of the vector space?”

“What vector- oh, I see. The real line is a one-dimensional vector space, so any non-zero vector is a basis. The standard choice, then, is v=(1).”

“Right. My first instinct was to say (1,0) and (0,1), but to a creature living on this curve, it would only have to choose to go forward or backward. If it walked a distance of 1, then x and y would change by… well, I suppose the curve length formula could be used. We’ll come back to that.”

“I think I’m starting to see where you’re going,” said Arthur. “Let’s say a bug was running along the curve to the right, and then it broke free at the point (1,e). Then it would travel along this tangent line.” Arthur sketched a line breaking away from the curve. “The slope at that point is e, so its new vector would be a scalar multiple of (1,e), and I can find its length with the standard Euclidean norm: |k(1,e)| = k*sqrt(1+e^2) for a positive multiple k. Then k represents how fast it was moving along the curve before it broke free.”

“That seems right. This is perhaps not the only way to define length, but it seems valid. I’ll admit it troubles me that the tangent vector should be in two-space when the manifold was in one-space-”

“Ah, but you forget, Spherius, that the tangent space is still a line. To the bug, it would simply look like it entered a new line, just like I see a new plane when I leave Flatland. So the tangent vectors fill out a new line which is straight, though it differs from the curve. Well, depending on…”

“Depending on what, Arthur?”

“Something just occurred to me. So we have the curve traced by y=e^t, and its slope at some point q is m. Then any bug leaving the curve at that point will be on the line sketched by the vector (1, m) leaving from the point q, and the length of the vector will be adjusted by multiplying by the speed.”

“I’m with you so far.”

“But what if the slope were very small? Well then, the tangent vector’s length would be k*sqrt(1+m^2) which would be almost k itself. So what we would be saying is that the length of the tangent vector is almost the same as the length traveled on the original curve.”

“Then that means… I suppose that means that the curve and the tangent vector are almost the same. Well, that makes sense, as a point where the curve is mostly flat would have a tangent vector that stays close to the curve.”

“And if it were entirely flat- i.e., a straight line- then its tangent vector would be exactly the same! This means we have discovered a way of determining the curvature of a manifold- we simply compare a vector in the space with the tangent vector starting from the same point!”

Spherius whirred quietly as he thought over those implications. “That is undeniably worth exploring further. We’ll come back to it later, but I think you’re right in principle. I can see points in Flatland where tangent vectors would barely depart the surface at all, and others where they would quickly rocket away. It’s all because Flatland is embedded in Spaceland.”

“Interesting… but Spaceland must be embedded in the tesseract. I wonder if all manifolds are embedded in R^n… well, I suppose it doesn’t matter for the time being. What does matter is that a manifold can be any finite dimension n, and if that’s the case then the tangent spaces are also dimension n. So at any point q in the neighborhood M, we could define an inner product <u,v>q.”

“And that would be the dot product?”

“Not necessarily, Spherius. There’s a problem, you see. You could calculate the dot product because you were outside Flatland, but I couldn’t see their three-dimensional images. For that matter, your measurements of vectors in Spaceland don’t take into account the fourth and higher dimensions. Since we aren’t relying on an ambient space, we can’t say what the ‘true’ inner product is, so we’ll just say that any inner product that is well-defined on the neighborhood is a metric. Well, that may be confusing, since it would lead me to think of this as an inner product on points in the manifold, when it's actually defined on the tangent space. So we'll call it a _Riemannian Metric_.”

“We should be careful, though- we want everything to be differentiable when possible. Our idea is that we take vectors in Flatland and then let them leave tangent to Flatland at a specific point, then we take the inner product of those new vectors and get a real number. We’re essentially saying that we map a point to a pair of vectors and take their inner product. But mapping points to vectors in the tangent space is exactly the definition of a vector field. So given two vector fields X and Y, <X,Y>(q) is a function that takes points of Flatland to real numbers.”

“Not bad, Spherius. That should work nicely.”

“Yes, but the problem is we have to make sure the function <X,Y>(q) is differentiable regardless of which vector fields we use. That could be far more calculations than are practical.”

“I don’t think it’s as bad as that. Remember, inner products are bilinear, so if we define a property on the basis it should hold for everything in the vector space. We have a neighborhood U in R^n and a map **x** that sends U to a neighborhood in M. So let’s stick with the standard basis for R^n: {(1,0,0,…,0), (0,1,0,…,0), …, (0,0,…,1)}, which are also written {e1,e2,…,en}. Then the differential map d **x** q will take the basis vector ei to a new vector in the tangent space at q. Call it d/dxi(q). Now if we just fix two of these basis vectors, say <d/dxi(q), d/dxj(q)>q, then this should vary smoothly with q.”

“I see. So you would create a new function, let’s call it gij(q), where q is some point in U. This function will take any point in U, find the images of the ith and jth basis vectors in the tangent space, and then return their inner product. And as long as all of these gij functions are differentiable, then we can be sure the same will be true for more complicated vector fields. So we can call the pairing of the manifold and its metric a _Riemannian manifold_ , and gij is the local representation in the coordinate system.”

“That sounds good to me,” said Arthur. “Next up is to establish when two metrics are equivalent, or I suppose a better word is _isometric_. We’ll take a function f:M->N that’s a diffeomorphism between Riemannian manifolds. Then if we take the inner product of two tangent vectors to M at a point p, we should get the same value when we evaluate the inner product of their images under f.”

“Or, to be precise, since the vectors are in the tangent space, we apply the differential of f at p to get the new vectors.”

Arthur chuckled. “And I thought I was the pedantic one. All right, so given p in M and u, v in TpM, we have <u,v>p = <dfp(u),dfp(v)>f(p). And I suppose we should allow for _local isometries_ , where there’s some neighborhood U where f:U->f(U) is an isometry.”

“Do we have any good examples of Riemannian manifolds? I suppose the dot product on R^n, like we said earlier, but that’s almost trivial.”

“I think a differentiable map into a Riemannian manifold will induce a metric. That is to say, suppose f:M->N and N has a Riemannian structure. Then we can just define the metric on M so that f becomes an isometry: <u,v>p = <dfp(u),dfp(v)>f(p). I like to use pull-backs rather than push-forwards whenever possible; it tends to make the topology easier to work with.”

“That sounds good- oh, wait. How do you know it’s positive-definite? What if you have a non-zero vector where the inner product with itself is zero? That could happen if the pull-back map isn’t well-defined, so a bunch of different vectors pull-back to zero.”

“You’re right… yeah, that can’t work in general. I think we have to require that the function f be an immersion; that way df is one-to-one. Then we know the metric on N is positive-definite, so the only vector whose norm is zero would be zero itself, and the only thing it pulls back to is the zero vector, meaning it’s also positive definite.”
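Readers who would like to see the functions gij for a concrete immersion can try the minimal Python sketch below. Its choices are assumptions, not part of the experiment in the story: the immersion is the usual angle parametrization of the unit sphere in Spaceland, the names `x`, `d_dxi` and `g` are hypothetical, the ambient inner product is the ordinary dot product, and the differential is approximated by central differences.

```python
import math

def x(u):
    """Sample immersion of a coordinate patch into Spaceland: the unit sphere."""
    theta, phi = u
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def d_dxi(u, i, h=1e-6):
    """Image of the i-th basis vector under the differential of x (central difference)."""
    up, down = list(u), list(u)
    up[i] += h
    down[i] -= h
    return tuple((a - b) / (2 * h) for a, b in zip(x(up), x(down)))

def g(u):
    """Matrix of g_ij(u) = <d/dx_i, d/dx_j>, using the dot product of Spaceland."""
    basis = [d_dxi(u, i) for i in range(2)]
    return [[sum(a * b for a, b in zip(vi, vj)) for vj in basis] for vi in basis]

u = (math.pi / 3, 0.5)
for row in g(u):
    print(row)
```

For this parametrization the hand computation gives g11 = 1, g12 = g21 = 0 and g22 = sin^2(theta), so the printed matrix should be close to that. The matrix is symmetric with positive determinant away from the poles, which is where this map stops being an immersion.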


	9. How Velocity was Derived and Metrics were Defined

“There are a few loose ends I'd like to tie up,” said Arthur. “We've talked about vector fields and about curves. I wanted to look into a vector field on a curve specifically.”

“What made you come up with this idea?”

“Your talk about the dot product. Yesterday, we started our discussion of the Riemannian metric with using dot products to find lengths. It occurred to me that in vector calculus, we can find the length of a curve by integrating the norm of its tangent vectors along the curve, and the norm is simply the square root of the dot product. In that case, we should be able to define arc length similarly on a manifold, simply replacing the dot product with our inner product.”

“I see. Well, let's start with a curve. I believe we've defined it as a differentiable function from an interval I into M?”

“That's right. Now it shouldn't be too hard to define a vector field along a curve. We'll just say that c:I->M. Normally we'd have V map points of M into the tangent bundle, but in this case we can pull back to I itself, and simply have V take real numbers to tangent vectors.”

“Symbolically, then, V(t) is a vector in Tc(t)M. Is it differentiable?” Spherius muttered. “What would that mean, that it is the subset of a differentiable vector field on M?”

“I don't think that's a good path to go down. I'm not sure we can count on being able to extend this field, nor would we want to. Remember, we've been trying to define manifolds without respect to a larger space, so that feels like the right way to approach vector fields as well. Let's think about its actions on functions instead.”

“Very well, Arthur. Last time, we were using temperature as a stand-in for a generic real-valued function. So we measure the temperature along a curve in M, say that road you're standing by.” Arthur took out his thermometer and began walking down the road. “Now at every point the temperature is changing, so we can assign a numerical value that represents how quickly the temperature is changing. So t is a time, and V(t) gives a vector that acts on a function. This means... yes, I think I see it. This means that V(t) is defined by how fast you travel the path, so that if you broke free of Flatland you would enter Spaceland at a vector in the tangent space to the point you left. Then those tangent vectors applied to f give how quickly the temperature would change if you continue into Spaceland.”

“That sounds right to me. So the vectors you describe, the ones in Spaceland, are the _velocity vectors_ , and they can be collected into the _velocity field_ of c.”

“Should we label them dc/dt?”

“What would that mean, exactly?” Arthur thought for a moment. “We said the vector is dependent on my speed, correct? Well, since the curve is one-dimensional, that means my speed is a single derivative d/dt, and that one derivative is a vector that generates the one-dimensional tangent space (aka the tangent line). Then we can apply the differential map dc to send that basis vector into a basis vector for the tangent line in Spaceland. So dc(d/dt) = dc/dt.”

“Excellent. And now we can define the length of a curve as a definite integral from elementary calculus. Say [a,b] is a closed subinterval of I, call it a _segment_. Then the length of the image of [a,b] is

$\ell_a^b = \int_a^b \sqrt{\langle dc/dt,\, dc/dt \rangle}\,dt$. I think we've settled that subject. We can now define arc length for any manifold with a Riemannian metric, which of course all of them have.”

“Why do you say that?”

Spherius was quiet as he spun on his axis. “Well, it's obvious, isn't it?”

“Those are dangerous words. I have often explored things that seemed obvious yet turned out to be untrue. We should try to prove that a Riemannian metric exists.”

Spherius sighed. “I should know better than to open my mouth. All right, how do we go about that?”

“Well, we can certainly define a Riemannian metric on a specific neighborhood V, correct?”

“Our definition allows it on any coordinate neighborhood, yes.”

“Then we can just define a new metric by summing the old ones. We'll say $\langle u,v\rangle_p = \sum_\alpha \langle u,v\rangle_p^\alpha$, summing across all possible metrics on that neighborhood.”

“That seems too easy,” said Spherius.

“Don't overcomplicate things if you don't have to.”

“But we may have to. Who's to say that sum converges?”

“I- oh, you're right. How could we... maybe there's a way to force that sum to be finite. Let's say we have a differentiable partition of unity on M called $\{f_\alpha\}$, subordinate to some set of coordinate neighborhoods $\{V_\alpha\}$. I suppose we never proved those exist, either, but supposing there is one- shine your lights through Flatland like before. Then we can multiply each inner product by the value of the f functions that correspond to the neighborhood. So $\langle u,v\rangle_p = \sum_\alpha f_\alpha(p)\langle u,v\rangle_p^\alpha$.”

“I fail to see how making the sum more complicated helps us.”

“Ah, but you forget, one of the key facts of a partition of unity is that it is locally finite, meaning that at any point p all but a finite number of those $f_\alpha(p)$ are zero. Thus, our infinite sum becomes finite.”

“I see. So we'd start with a partition of unity, and then we'd define a new inner product at each point as the sum of a finite set of other inner products. Well done, Arthur.”
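Two quick checks on this chapter's formulas, offered as a sketch rather than as part of the original conversation. The circle of radius r is a hypothetical example, and the second check assumes the usual convention that the functions in a partition of unity are non-negative and sum to 1 at every point:

```latex
% (1) Arc length of the circle c(t) = (r cos t, r sin t) in R^2
%     with the dot product as the metric.
\[
\frac{dc}{dt} = (-r\sin t,\; r\cos t), \qquad
\Big\langle \frac{dc}{dt}, \frac{dc}{dt} \Big\rangle = r^2, \qquad
\ell_0^{2\pi} = \int_0^{2\pi} \sqrt{r^2}\,dt = 2\pi r .
\]
% (2) The glued metric is still positive-definite: for u != 0 every
%     local inner product <u,u>^alpha_p is positive, each f_alpha(p) >= 0,
%     and at least one f_alpha(p) > 0 because they sum to 1.
\[
\langle u, u \rangle_p
= \sum_\alpha f_\alpha(p)\,\langle u, u \rangle_p^\alpha > 0
\qquad \text{for } u \neq 0 .
\]
```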


	10. How Arthur and Spherius Made a Connection

**Summary for the Chapter:**

> To take derivatives, you need a chain rule. To do that, you need an affine connection.

Arthur was out in the wilds waiting for Spherius to arrive when he started to feel warm. It wasn't particularly unpleasant, but it was noticeable and its source was unclear. He stepped to the side and felt the heat disappear. It was apparently only in a small neighborhood-

“Arthur!” shouted Spherius. “Arthur, are you all right?”

“I'm fine, Spherius, what do you mean?”

“You didn't notice the weight on you?”

“No, I didn't feel any southward pull-”

“Downward, not southward. I accidentally set an object on Flatland and it was pressing downward. You didn't feel it?”

“I felt a bit warm, now that you mention it.”

“I see. And it was only you and the things around you?”

“Yes, it seemed localized. I only had to step a few feet away and it was gone.”

“This is something I hadn't really considered.”

“What would that be?”

“Well don't you see, Arthur? Up until now, we've thought of vector fields as defining motion through a manifold. But what if a field pushes in such a way that all the force is wasted? No work is done, save for the heat generated by pushing against the adhesive force of Flatland. Then every spot in the neighborhood is pushed in a parallel direction, so we can call this a _parallel vector field_.”

“Really? It seems to me that, since the vectors are perpendicular to Flatland, it should be called a perpendicular vector field.”

“I discovered it, so I get to name it.”

“All right, all right. So how exactly are you defining this?”

“I think we need a new kind of derivative.” Spherius and Arthur both thought for a moment.

“You mentioned 'work' a minute ago. If I remember my physics, that means we are projecting a force vector onto a vector that represents motion. In Euclidean Space, we'd use a dot product, so we'll probably need to substitute the inner product of a Riemannian metric.”

“That sounds like a good start. We also should ensure that it meets the standard sum and product rules of differentiation. Draw a path, Arthur.” Arthur sketched a slightly winding curve on the ground. “Now, hold on to it and I'm going to push you along.” Arthur gripped the rough line as best he could as the push began. It caused him to speed up and slow down as he approached the curves. His speed was constantly changing- hopefully this new derivative would allow them to calculate acceleration. The pressure lightened until he came to a stop.

“Thank you for letting up, Spherius, I was starting to feel ill.”

“But I haven't let up, Arthur. I'm still pushing.”

“Then why am I not moving?”

“Simple: I pressed this entire neighborhood in the same direction, and you just reached a point where my field is perpendicular to Flatland. So all the force is wasted, causing you to feel zero work. Well, except maybe the heat. I'll let up now.” Arthur felt it get a bit cooler. “What we see here is that my pushing represents a vector field along this curve, but it would be valuable to subtract out all of the wasted force. So let's say we have a vector field V along a differentiable curve c:I->M, and define the _covariant derivative of V along c_ to be the projection of the vector field onto the curve.” (Figure 10.1)

“I'm with you so far,” said Arthur. “We'll use a capital D to distinguish from the standard derivative, so DV/dt. And if we sum two vector fields, we should be able to project them both as D/dt(V+W) = DV/dt + DW/dt. In other words, if you and a friend push me at the same time, I could compute your forces separately and add them. We also need to see how it interacts with real functions, like our temperature measurements.”

“All right, so let's again say f is temperature as a function of time, and V is the vector pushing you along the path c. Then we can say fV represents a new vector field, one that gives the instantaneous rate of change of the temperature in the direction chosen by V.”

“But V might be leaving Flatland, so we want to project this vector onto the curve,” said Arthur. “So then D/dt, the covariant derivative along c, must have some predictable interaction with fV. We'd expect it to be a sort of product rule: D/dt(fV) = (df/dt)V + f(DV/dt). Does that make sense? Let's see, df/dt and f would both yield scalars at any given point. So the first term is the rate of change of temperature along the curve times the vector that V gives, and the second term is the temperature at the point times the push of V along the curve.”

“That’s good as far as it goes, but I’m also concerned with the connection between the curve and the space it sits in.”

“What do you mean?”

“I know you’ve emphasized time and again that we should be able to evaluate a manifold in itself, but if it does sit in a larger manifold then there might be vector fields on both of them. We need to be careful that the relationship between the vector field on the curve and that on the larger manifold do not contradict each other.”

“I see,” said Arthur. “So if we have a vector field Y on the larger manifold, and a field V on the curve, then DV/dt must respect their relationship. I guess V is the composition of Y and the curve c, so V(t)=Y(c(t)).”

“So we must have something akin to the chain rule!” shouted Spherius. “DV/dt = (DY/dc)(dc/dt).”

“That seems intuitively right,” said Arthur. “Yet I’m not sure about that notation. We haven’t yet defined how to differentiate a vector field with respect to a curve. So instead of DY/dc, we need to create some new structure to describe their connection. Let’s call it $\mathrm{Del}_{dc/dt}Y$, where Del will be called the _affine connection_ between the vector fields.” *

“Then this ‘connection’ of yours takes the two vector fields, one on a curve and the other on the space as a whole, and it returns a new vector field which is the covariant derivative. But what is it, really?”

“You just said it. It’s a map that takes two vector fields and returns a third one. Let’s denote the set of differentiable vector fields on M as X(M), and the set of differentiable real-valued functions as D(M). Then Del:X(M) x X(M) -> X(M). This might even be linear, since derivatives are linear.”

“It’s actually starting to remind me of the gradient. Almost as though someone picked that symbol for a reason,” said Spherius suspiciously.

“I’m sure I don’t know what you’re talking about,” said a smiling Arthur. “Explain it.”

“Well, the gradient is an operator on differentiable functions. Essentially, it takes a function at a point and returns a vector, specifically the vector along which f has its greatest increase. So the gradient is itself a type of vector field on a Euclidean space.”

“But d/dt is also a vector field, albeit a very simple one. So we apply a vector field to a function to get a new vector field.”

“This is going too fast,” said Spherius. “We need to hammer out exactly what the connection is. So let’s say my pushing on Flatland is Y, and some other vector field is X.”

“We’ll say X is the natural gravity of Flatland,” suggested Arthur. “There is always a slight pull in the southward direction, as the rain falls, though it is weaker in temperate climates.”

“Yes… yes, that might do it. Let’s find a storm. Ah, I think I see one. Here, I’ll make you a shortcut.” A light appeared near Arthur. He walked through it to enter the familiar blue and gray of the tangent plane, then went on a few hundred feet until he passed through another light. Immediately he felt the gentle pelt of rain falling southward. “There we are. X is the pull of gravity, so each raindrop is following a flow, as we called it, determined by X. At any individual point, we can use gravity to determine the direction a raindrop will fall.”

“Chaos suggests that we have incomplete knowledge of the forces acting on the raindrop, so our prediction of the motion will be imperfect, but it should be reasonably accurate in a local neighborhood.”

“Certainly I’m not claiming any physical model is flawless. All models are wrong, but some are useful. ** Now I push from the outside, like so.” Arthur felt a slight shove that caused him to lose his balance, and the rain drops gusted about 30 degrees off of their original path before settling back to their southward motion.

“That was a sudden shove. Let’s try something a little more sustained.” Spherius got out a small motor he had brought with him. Arthur couldn’t see as Spherius gently placed the motor against Flatland and turned it on. It exerted a slight force along a circle with roughly ten times the diameter of Arthur. They watched as the raindrops settled into a new path. They fell southward into the circle, then eased slightly eastward, speeding up as they fell. When they left the circle, they maintained their eastward momentum, which meant they were falling almost southeast as they faded into the distance. (Figure 10.2)

“I know the man who lives in that direction,” said Arthur. “He’s going to be very confused why the rain is falling sideways.”

“You can explain it to him in due time.”

“Assuming people believe my stories of the third dimension this time.”

“I can help you with that. For the moment, we should focus on explicating the mathematics of these fields.”

“Agreed. Is your motor generating a uniform force?”

“No, it’s angled, so the force is strongest at the north and weakest at the south.”

“I see. That means you have an eastward, or positive, force that decreases as you go south. That means the connection DelXY is negative!”

“What you’re saying, I think, is that this connection DelXY measures how quickly Y changes as we pass along X. Y at any point gives the instantaneous velocity produced by my motor, while X is the velocity of the rain traveling south by gravity alone. We could attach a vector to each point that would illustrate how strong the force is at those points. But then… ah, I see it. If I place an eastward vector at one point, say here.” A green light in the eastward direction appeared over a spot in front of Arthur. The light did not seem to disturb the motion of the raindrops. “Then I place another one a foot to the south.” Now an orange light appeared, slightly shorter than the green one. “Since the southward direction is the natural force of X, gravity, then we could calculate the average rate of change in Y by simply dividing the difference in the vectors by the distance between their origin points.”

“Except that that’s not quite right,” said Arthur. “The path of the rain drops isn’t due south anymore. Your motor changes their path, so Y changes X and then X changes Y right back.”

“Exactly! The path of the raindrops is not fixed. But if we allow the second vector to get closer-” the orange light moved towards the green one, getting gradually longer as it did. “Well, the closer it is, the more our calculation would represent the true change in Y along the path of X. So if we take a limit…”

“By Gauss, you’ve recovered a derivative,” said Arthur in awe. “This connection truly is the derivative of one vector field with respect to another.”

“So we have it. DelXY is the instantaneous rate of change of Y along X, evaluated at any point.”

“And that fits what we said earlier, since every point gives a vector, so we have in fact created a new vector field. Then the vectors will all be pointing roughly westward, since the eastward force of your motor is decreasing as the raindrops follow their path.”

“Since it’s a derivative, it should follow all the familiar properties of derivatives,” said Spherius. “Linearity, for starters. DelX(Y+Z) = DelX(Y) + DelX(Z).”

“And if we have two dependent vector fields with scalar multiples, it should be linear in those terms as well. So $\mathrm{Del}_{fX+gY}Z = f\,\mathrm{Del}_X Z + g\,\mathrm{Del}_Y Z$. Note that f and g are not constants but scalar functions, since they return real values at every point.”

“Would that be true in the second entry as well?”

“I think not,” said Arthur. “In that case, we would be differentiating a product of two functions, one scalar and one vector. So we need a version of the product rule: DelX(fY) = fDelX(Y) + DelX(f)Y. But since f is a real-valued function, DelX(f) is simply X applied to f, so DelX(fY) = fDelX(Y) + X(f)Y.”

“That seems fair. And now that we know what a connection is, we can finalize our definition of the covariant derivative. It is simply a way of determining the net action of a derivative in space within the manifold itself, via projecting the vector into the manifold.”

“I’d like to be a little more precise, mathematically,” said Arthur. “Our definition works best along a curve c in a manifold M, where c is parametrized by t and V is a vector field along c. Then the vector field DV/dt is a new vector field along c, where D/dt(V+W) = DV/dt + DW/dt, D/dt(fV) = (df/dt)V + f(DV/dt), and if V is induced by some larger vector field Y that is applied to the whole of M, then DV/dt = $\mathrm{Del}_{dc/dt}Y$.”

“I see. That last line is saying that Y changes as you follow the path of c, and that should be the same thing as the projection of V onto c. It all fits. In theory, anyway. I’d like to make this a little more concrete, and that means we need to use local coordinates.”

“Sounds like a plan.” Arthur began marking points on two axes in the neighborhood, a ritual that was becoming quite familiar. “I suppose in this case, X is a constant, since rain falls at a constant speed.”

“It does?” asked Spherius. “Does gravity not cause it to accelerate?”

“Technically it does, but since we are in such a temperate climate the force is quite weak. I can feel a slight tug southward, but obviously I am able to overcome it, or else everyone in Flatland would fall to the South Pole. So we can approximate X as being identically (0,-1), or 0d/dx + (-1)d/dy.”

“Y, on the other hand, varies with position, mostly in y but slightly in x. With the coordinates as you have drawn them, it comes out to roughly (3(y+10) +1/2x)d/dx + (1/5 e^(-1/y^2)+1)d/dy. You can see that positive values of y will lead to a stronger eastward force than negative ones.”

“What does the coefficient of d/dy represent?”

“There’s a slight upward push of the motor that I couldn’t get rid of. It’s never particularly strong and dies off exponentially, but it is there.”

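“So if that’s the case, we can then compute DelXY explicitly,” said Arthur, scratching out some calculations. “We’d get $0\,\mathrm{Del}_{d/dx}Y + (-1)\,\mathrm{Del}_{d/dy}Y$. Since the basis fields d/dx and d/dy don’t change from point to point in these coordinates, only the coefficients get differentiated, and that second term expands to $-\mathrm{Del}_{d/dy}\big((3(y+10) + \tfrac{1}{2}x)\,d/dx + (\tfrac{1}{5}e^{-1/y^2} + 1)\,d/dy\big) = -\big(3\,d/dx + \tfrac{2}{5y^3}e^{-1/y^2}\,d/dy\big)$.”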

“So the rate of change of the motor’s force at any point along a raindrop’s path is determined by that expression. That shows that the eastward push is always decreasing at a rate of 3 while the vertical force varies based on the y-value. And it’s negative because it’s always in the opposite direction of the rain’s natural movement.”
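For readers who want the steps behind that calculation, here is a sketch. It assumes that Arthur's hand-drawn axes behave like ordinary flat coordinates, so the connection of one basis field along another vanishes, and it takes the d/dy coefficient exactly as Spherius states it; the labels $Y^1$ and $Y^2$ for the two coefficients are introduced here for convenience:

```latex
% Assumption: Del of d/dx and d/dy along d/dy is zero (flat coordinates),
% so the product rule only differentiates the coefficient functions.
\[
X = -\,\frac{\partial}{\partial y}, \qquad
Y^1 = 3(y+10) + \tfrac{1}{2}x, \qquad
Y^2 = \tfrac{1}{5}e^{-1/y^2} + 1 .
\]
\[
\frac{\partial Y^1}{\partial y} = 3, \qquad
\frac{\partial Y^2}{\partial y}
= \tfrac{1}{5}\,e^{-1/y^2}\cdot\frac{2}{y^3}
= \frac{2}{5y^3}\,e^{-1/y^2} \quad (y \neq 0).
\]
\[
\mathrm{Del}_X Y
= X(Y^1)\,\frac{\partial}{\partial x} + X(Y^2)\,\frac{\partial}{\partial y}
= -3\,\frac{\partial}{\partial x}
  - \frac{2}{5y^3}\,e^{-1/y^2}\,\frac{\partial}{\partial y}.
\]
```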

“I’m feeling good about this. We need to generalize to an n-dimensional manifold. So let’s start with a system of local coordinates (x1,…, xn) about p. Visually, I’m just assigning each point in the neighborhood a set of real numbers, but formally I mean each point on the manifold is an n-dimensional object, where each coordinate is determined by a function on U, a local neighborhood of R^n. We’ll reuse our earlier convention, and say Xi is the image of the ith basis vector in U, carried over to the tangent space at the point p. So then two different vector fields are both linear combinations of the basis vectors Xi. X = ΣxiXi and Y = ΣyjXj.”

“I see where you’re going with this,” said Spherius. “We can now exploit linearity and our other properties and get a straightforward (though ugly) expression for the connection. 

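$\mathrm{Del}_X Y = \mathrm{Del}_{\sum_i x_i X_i}\big(\sum_j y_j X_j\big) = \sum_i x_i\,\mathrm{Del}_{X_i}\big(\sum_j y_j X_j\big) = \sum_{ij} x_i\,\mathrm{Del}_{X_i}(y_j X_j) = \sum_{ij} x_i y_j\,\mathrm{Del}_{X_i}X_j + \sum_{ij} x_i X_i(y_j)\,X_j$.”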

“You weren’t wrong that that’s ugly,” laughed Arthur. “Let’s create some shorthand. Each of these DelXi(Xj) are linear combinations.”

“Right,” said Spherius. “I keep thinking it’s just the Kronecker-delta, but while the Xi are the image of orthogonal vectors, they are not orthogonal themselves.”

“Exactly. So it comes out to a linear combination of Xk, and the coefficients are dependent on i, j, and k. So we’ll denote the linear combination as $\mathrm{Del}_{X_i}X_j = \sum_k \Gamma_{ij}^k X_k$, and then

$\mathrm{Del}_X Y = \sum_k \big(\sum_{ij} x_i y_j \Gamma_{ij}^k + X(y_k)\big) X_k$.”

“I’m not sure that’s any simpler, Arthur Square.”

“Perhaps not,” said Arthur, chuckling again. “But at least now it’s written as a linear combination, and it shows the connection is dependent on the values of xi and yj at a point p, and also the partial derivatives of the Y field, X(yk), at p.”

“So all this is to say that if we have a parametrized curve and a vector field along it, we now have a method of differentiating the vector field along the curve. And since V yields velocity vectors, DV/dt should give us a way of describing the acceleration of a particle as it moves along the curve.”

“I think you’re right, though we should probably hold off on that for a bit and explore more of what this new construction does. For starters, we should describe a vector field that is constant on a curve. That would correspond to a particle moving at constant velocity, much like the raindrops. That should happen anytime there’s no push at all on the curve.”

“I hate to contradict you, Arthur, but you’re forgetting our earlier example. If I press against Flatland in such a way that all the vectors are parallel to each other, then I’m pushing directly down on the plane but causing no motion within it. So it may not be true that the vector field is constant from my perspective (I could push with different force at different points), but nevertheless its covariant derivative must be identically zero.”

“I see. So DV/dt=0, which means the velocity vectors are not changing. The covariant derivative, then, is a type of acceleration. So V could still send a particle along a curve c, so long as it gave the same tangent vector at every point. Well, the equivalent tangent vector in a different tangent space. So an example would be a person riding down a river with steady current- they are neither speeding up nor slowing down, and if they’re being pushed at all it must be an external push that produces no motion in Flatland.”

“We should have a name for these basic vector fields,” said Spherius. “After all, if I have a curve c in M and I pick both a starting point and a speed, then I know exactly how fast it is going at every point. So we’ll say if we pick a time t0 and a tangent vector V0 that is the velocity at c(t0), then we can construct a parallel vector field V along c where V(t0)=V0. We can call it the _parallel transport of V(t0) along c_. And now my ‘parallel’ terminology comes into its own, since we are transporting the vector V(t0) to other points on c and laying it parallel to the original vector.”

“So V0 is just any vector in the tangent space. But isn’t t0 a real number? Shouldn’t V be evaluated on points of M?”

“I suppose that’s technically true, but we’re defining V only on c right now. That means that each point on c is the image of a real number, so we might as well pull back and consider V as acting on t, cutting out the intermediate step.”

“Fair enough. And I’m reasonably sure the parallel transport is unique. It may take me a little while to prove it, but it seems intuitively true, so we can move on for now.”
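A sketch of why Arthur's hunch about uniqueness is plausible, written in local coordinates. The component names $x_i(t)$ for the curve and $v_k(t)$ for the field are assumptions introduced here, and the $\Gamma_{ij}^k$ are the coefficients defined earlier in the chapter:

```latex
% Curve c(t) = (x_1(t), ..., x_n(t)), field V(t) = sum_k v_k(t) X_k.
% Product rule plus DV/dt = Del_{dc/dt} applied to each X_j gives:
\[
\frac{DV}{dt}
= \sum_k \Big( \frac{dv_k}{dt}
  + \sum_{ij} \Gamma_{ij}^k \,\frac{dx_i}{dt}\, v_j \Big) X_k .
\]
% V is parallel exactly when every coefficient vanishes:
\[
\frac{dv_k}{dt} + \sum_{ij} \Gamma_{ij}^k \,\frac{dx_i}{dt}\, v_j = 0,
\qquad k = 1, \dots, n, \qquad V(t_0) = V_0 .
\]
```

This is a linear first-order system of ordinary differential equations in the $v_k$, and such a system with a given initial value has exactly one solution on the interval, which is what uniqueness of the parallel transport amounts to.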

**Notes for the Chapter:**

> * The correct symbol for "Del" is an upside down delta, often referred to as "Nabla." However, that symbol is not supported on this site.
> 
> ** Attributed to George Box.


	11. On Compatibility and Symmetry of Connections

**Summary for the Chapter:**

> Arthur and Spherius decide what makes a connection "nice."

“I would like to know whether our new idea of ‘connection’ is compatible with a Riemannian metric.”

“Compatible?” asked Arthur. “What do you mean by that?”

“I was thinking about whether there’s any sort of ‘product rule’ for the derivative of an inner product,” answered Spherius. “Naively, it seems that given two vector fields V and W along a curve c:I->M, the derivative of their inner product should be d/dt<V,W> = <DV/dt, W> + <V, DW/dt>, as in the product rule from elementary calculus. But is that true? Given how little we know about affine connections, I worry that they may be a complicating factor.”

“It’s good practice, mathematically, that you thought to explore this concept before an application was obvious. Let’s dig in a bit. Riemannian metrics are evaluated on the tangent space of a manifold, so let’s pick a point c(t0) on the curve and enter the tangent space there.” Arthur imagined the blue-gray plane he was becoming used to. “Now, since we’re in a vector space, we can define a basis and write V and W as linear combinations of basis vectors. In fact, let’s make it an orthonormal basis for simplicity’s sake. Call it $\{P_i(t_0)\}_{i=1}^n$. This will do for c(t0), at least.”

“I have an idea,” said Spherius. “We saw earlier that, since we have a connection Del on this manifold, we can generate a parallel transport of each one of these basis vectors. So extend them to $\{P_i(t)\}_{i=1}^n$, which are defined for every t. And what’s more, all of their covariant derivatives are zero, so a particle travels the curve at a constant speed.”

“That will do nicely. Then we can write the vector fields as V = ΣviPi and W = ΣwiPi, where each of the v’s and w’s are differentiable functions of t. And then by linearity and the product rule, since each DPi/dt is zero, DV/dt = Σ(dvi/dt)Pi and DW/dt = Σ(dwi/dt)Pi. Then the inner product is easy to compute, since the vectors are orthonormal.”

“Wait!” shouted Spherius. “We don’t know that!”

“But we built the basis to be- oh. Ohhhh. You’re right, we only know they are orthonormal at t0, not at the other points we transport them to. But without that- we need some kind of consistency if we’re going to be able to do these calculations. Well, Spherius, you noticed it so what do you think we should do about it?”

“It seems to me the issue is that each inner product of basis vectors gives a certain value at the origin point of the curve, and we want that value to be the same across the curve. We could require that all inner products be constant on curves? No, that’s far too restrictive. But wait, the P’s are all parallel vector fields. What if we simply mandate that the inner product of any two parallel vector fields is constant along a smooth curve c?”

“Yes, I was… I was just thinking that,” said Arthur nervously. He was getting complacent- the sphere wouldn’t need his help much longer at this rate. “So that can be the definition of a connection being _compatible_ with a metric: Given any smooth curve c and two parallel vector fields P and P’, the inner product <P,P’> = constant. Now let’s get back to our proof. The right side of your proposed product rule is <DV/dt, W> + <V, DW/dt>, so maybe we can simplify it.”

“Both terms are bilinear, so we can expand them in the same fashion as the standard dot product: Σ(dvi/dt) wi + Σ(dwi/dt) vi = Σ[(dvi/dt) wi + (dwi/dt) vi]. But then each term of the sum-”

“Is the standard product rule of calculus!” shouted Spherius before Arthur could finish. “So it becomes Σd/dt[wivi] = d/dtΣ[wivi] = d/dt<V,W>. So it does work!”

“It does indeed,” said Arthur, pushing down his jealousy. “And the converse is obvious- if you know the product rule works, then d/dt<P,P’> = <DP/dt, P’> + <P, DP’/dt> = <0, P’> + <P, 0> = 0 for all t. I also think I see a way of characterizing compatibility that uses the connection directly.”

“Oh? What might that be?”

“Well, let’s say we have three different vector fields on M. Call them X, Y, Z. Then let’s consider what X<Y,Z> is.”

“Wait, what kind of object is that?” asked Spherius. “X is a vector field, so it takes in points of M, but <Y,Z> is a real number.”

“Yes, well, let’s pick a point p, and construct some curve c so that c(t0) = p and dc/dt (t0) = X(p), so that X and dc/dt give the same vector at p. But then if we focus in on c, we can apply d/dt to <Y,Z> using the product rule.”

“Interesting,” said Spherius. “This notation will be the death of me. But you’re saying that if you want to apply a vector field to an inner product, you must first define a curve and reduce the action of X to a derivative of a real valued function.”

“Exactly. Y and Z are both functions of t, so their inner product takes real numbers to real numbers, and therefore normal calculus applies. Then we can say that for any p, X<Y,Z> = d/dt<Y,Z> = <DY/dt, Z> + <Y, DZ/dt>. But what are DY/dt and DZ/dt?”

“Since we’re evaluating along curves, they must be connections! DY/dt = DelX(p)Y and DZ/dt = DelX(p)Z.”

“Precisely. And since p was chosen arbitrarily at the beginning, we can say X<Y,Z> = <DelX(p)Y,Z> + <Y, DelX(p)Z>.”

“I see. So you’ve found a way to evaluate compatibility more directly. Instead of having to check an arbitrary pair of parallel vector fields on an arbitrary curve, you can just check three vector fields at a single point.”

“That’s the idea. And the form of it is the same as that of the product rule. Hopefully it makes calculations easier. And now, for another mathematician question,” said Arthur. “Is the connection that is compatible with the Riemannian metric unique?”

“You mean if we have a metric to begin with, can we find more than one connection to work with it? I suspect no, but you’ll probably want to prove it.”

“Right you are. I’m going to explore what happens if we take our new characterization and try to combine a few different orders. X<Y,Z> = <DelXY, Z> + <Y, DelXZ>, Y<Z, X> = <DelYZ, X> + <Z, DelYX>, and Z<X, Y> = <DelZX, Y> + <X, DelZY>.”

“Why would you want to do that?”

“Call it ‘mathematician’s instinct.’ I’m going to add together the first two and subtract the third, and if we’re lucky maybe some clutter will cancel out. X<Y,Z> + Y<Z, X> - Z<X, Y> =

<DelXY, Z> + <Y, DelXZ> + <DelYZ, X> + <Z, DelYX> - <DelZX, Y> - <X, DelZY>. Now if we use linearity and symmetry, we can rewrite this as <DelXY + DelYX, Z> + <DelXZ - DelZX, Y> + <DelYZ - DelZY, X>. And that’s… well, it’s a little cleaner, I suppose…”

“But wait, what if we replace the differences in the last two terms with the Lie Bracket?”

“What? Where in space did you get that from?”

“Hear me out, Arthur. The connection is essentially showing us how quickly one vector field changes as we follow the path of another, correct? But if you think back to our work on local flows, we saw that the bracket [X,Y] was what happened when you monitored the change in Y along the flow of X. But that means that they describe the same concept, and DelXY – DelYX = [X,Y].”

Arthur was silent for a moment. “It can’t be that simple.”

“Why not? In fact, look what happens when we take the connections of the basis vectors and subtract them. DelXiXj – DelXjXi = [Xi, Xj] = 0, because the Xi come from coordinates and partial derivatives commute. And that means the connection is _symmetric_ for basis vectors: DelXiXj = DelXjXi.”

“But nothing in our definition of a connection requires this. You’re just going off vague intuition. I’m sure I could construct a connection that doesn’t have this symmetry property.”

“Can you think of any way to simplify things without it, though?”

Arthur stared at the equation for a long minute. “All right, I don’t think every connection is symmetric, but let’s suppose the one we’re working with is. What would be the next step?”

“We can rewrite < DelXY + DelYX, Z> \+ < DelXZ - DelZX, Y> \+ < DelYZ - DelZY, X> = <[X,Y] + 2DelYX, Z> +<[X,Z],Y> \+ <[Y,Z],X>.”

“Oh-ho, there we have it!” chuckled Arthur. “That last expression is equal to what we started with, X<Y,Z> \+ Y<Z, X> \- Z<X, Y>. But that means we could solve that equation to find <DelYX, Z> in terms of X, Y, Z.”

“And what would that be?”

“Hey, if you want to grind through that algebra, be my guest. All I care about is that it means we can find a unique connection this way. And that means that as long as our connection is compatible with the metric and symmetric, it is the only one. We should call this _the Riemannian Connection_.”

“I like the term _Levi-Civita Connection_.”

“I won’t ask why, but you can call it that if you like. What’s more, we can simply define Del according to the expression above, which means a compatible symmetric connection definitely exists on any Riemannian manifold.”
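For readers who prefer conventional notation, the expression Arthur proposes to define Del by can be solved explicitly for the connection term. The block below is only a restatement of the dialogue's algebra, writing ∇ for Del (a notational assumption of this note); the result is commonly called the Koszul formula. Note that −⟨[X,Z],Y⟩ is the same as +⟨[Z,X],Y⟩, by antisymmetry of the bracket.

```latex
\begin{align*}
X\langle Y,Z\rangle + Y\langle Z,X\rangle - Z\langle X,Y\rangle
  &= \langle [X,Y], Z\rangle + 2\langle \nabla_Y X, Z\rangle
     + \langle [X,Z], Y\rangle + \langle [Y,Z], X\rangle \\
\Longrightarrow\quad
\langle \nabla_Y X, Z\rangle
  &= \tfrac12\Big( X\langle Y,Z\rangle + Y\langle Z,X\rangle - Z\langle X,Y\rangle
     - \langle [X,Y], Z\rangle - \langle [X,Z], Y\rangle - \langle [Y,Z], X\rangle \Big)
\end{align*}
```

Since the right-hand side involves only the metric and the Lie bracket, it pins down ∇ completely, which is the uniqueness the characters are after.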

“Terrific,” said Spherius. “The only thing left to do is write this in a local coordinate system (U, **x**). We said before that we can always write $\mathrm{Del}_{X_i} X_j = \sum_k \Gamma_{ij}^k X_k$. I want to call those coefficients of the connection the _Christoffel Symbols_.”

“Sure, why not. So then let’s work out those coefficients. We’ll need to deal with the inner product, so we’ll go back to when we defined the Riemannian Metric and write gij = <Xi, Xj>, where Xi are the vectors in the tangent space that are the images of the standard basis vectors in local coordinates. Then the connections… ah, nuts, I think we are going to have to solve that equation.”

“I don’t think it’s as bad as all that,” said Spherius. “We just have to subtract off a few terms and divide by 2. We have X<Y,Z> \+ Y<Z, X> \- Z<X, Y> = <[X,Y] + 2DelYX, Z> \+ <[X,Z],Y> \+ <[Y,Z],X> = <[X,Y], Z> \+ 2<DelYX, Z> \+ <[X,Z],Y> \+ <[Y,Z],X>, and so

<DelYX, Z> = ½[ X<Y,Z> \+ Y<Z, X> \- Z<X, Y> \- <[X,Y], Z> \- <[Y,Z],X> \- <[X,Z],Y>]. So that’s a decent formula. We just have to use it to find the Christoffel Symbols of the connection, and doing it for the basis vectors alone should be enough.”

“All right, then let’s set up some terminology. We have $g_{ij} = \langle X_i, X_j\rangle$ and $\mathrm{Del}_{X_i} X_j = \sum_k \Gamma_{ij}^k X_k$, where those gamma expressions are the coefficients. Now let’s say we take any three of the basis image vectors that we can call $X_i$, $X_j$, $X_k$ and try to evaluate our formula.”

“We can substitute those vectors in for X, Y, and Z, and since these are the basis fields of a coordinate system, all of the brackets are zero. What’s more, each of the vector fields could also be seen as a partial derivative, so the expression reduces to $\tfrac12\left[\frac{\partial}{\partial x_i} g_{jk} + \frac{\partial}{\partial x_j} g_{ki} - \frac{\partial}{\partial x_k} g_{ij}\right]$.”

“On the other hand,” said Arthur, “what if I plug in directly to the left-hand side? So $\langle \mathrm{Del}_{X_j} X_i, X_k\rangle = \langle \sum_L \Gamma_{ji}^L X_L, X_k\rangle$.”

“Where did that L come from?”

“It’s just a dummy variable, something we can use to index and then sum away. But k is taken, so we can’t use it. Then we’ll use linearity once again, and we’re left with $\sum_L \Gamma_{ji}^L \langle X_L, X_k\rangle = \sum_L \Gamma_{ji}^L g_{Lk}$.”

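“Then we put these together, and $\sum_L \Gamma_{ji}^L g_{Lk} = \tfrac12\left[\frac{\partial}{\partial x_i} g_{jk} + \frac{\partial}{\partial x_j} g_{ki} - \frac{\partial}{\partial x_k} g_{ij}\right]$. Now we just need to solve for the Christoffel Symbols.”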

“Let me think about this,” said Arthur. “The g’s are all basis vectors- no, wait, each individual $g_{Lk}$ is a single inner product of two basis vectors, and together they fill out a square matrix. The basis vectors are linearly independent and the inner product is positive definite, so that matrix has an inverse! So we can multiply by the inverse matrix to get an equation: $\Gamma_{ji}^m = \tfrac12 \sum_k \left[\frac{\partial}{\partial x_i} g_{jk} + \frac{\partial}{\partial x_j} g_{ki} - \frac{\partial}{\partial x_k} g_{ij}\right] g^{km}$. I had to bring in a new index m, since k is now being summed away. And $g^{km}$ is just the corresponding entry of the inverse of the matrix $(g_{km})$.”
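To see the formula for the Christoffel symbols in action, here is a minimal numerical sketch in Python. It assumes an example that is not in the dialogue: the unit sphere with coordinates (theta, phi) and metric g = diag(1, sin² theta). All function names are hypothetical, and the partial derivatives of the metric are only estimated by central differences, so this is an illustration rather than an exact computation.

```python
import math


def metric(x):
    """Metric g_ij of the unit sphere in coordinates x = (theta, phi) (assumed example)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix, giving the entries g^{km}."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def d_metric(x, k, h=1e-5):
    """Central-difference estimate of d(g_ij)/dx_k, returned as a matrix in i, j."""
    x_plus, x_minus = list(x), list(x)
    x_plus[k] += h
    x_minus[k] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    n = len(x)
    return [[(g_plus[i][j] - g_minus[i][j]) / (2 * h) for j in range(n)]
            for i in range(n)]


def christoffel(x):
    """Return gamma[m][i][j] = Gamma_ij^m using
    Gamma_ij^m = 1/2 * sum_k (d_i g_jk + d_j g_ki - d_k g_ij) * g^{km}."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = [d_metric(x, k) for k in range(n)]  # dg[k][i][j] = d g_ij / d x_k
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                gamma[m][i][j] = 0.5 * sum(
                    (dg[i][j][k] + dg[j][k][i] - dg[k][i][j]) * g_inv[k][m]
                    for k in range(n)
                )
    return gamma


if __name__ == "__main__":
    symbols = christoffel((1.0, 0.3))
    print("Gamma_phiphi^theta ~", symbols[0][1][1])
    print("Gamma_thetaphi^phi ~", symbols[1][0][1])
```

For this particular metric the exact values are Γ^θ_φφ = −sin θ cos θ and Γ^φ_θφ = cot θ, so the printed numbers can be compared against those.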

“I noticed that the i,j,k symbols ‘cycle’ in the three expressions, so hopefully that will make it easier to remember. Then we can also use this expression for the covariant derivative.

$\frac{DV}{dt} = \sum_k \left[ \frac{dv^k}{dt} + \sum_{i,j} \Gamma_{ij}^k v^j \frac{dx_i}{dt} \right] X_k$.”

“Also, in Euclidean space, all the partial derivatives in the Christoffel symbols are zero, since the basis vectors are perpendicular and do not change from point to point, so every gij is constant. That means the covariant derivative is the same as the standard derivative in Euclidean space. This may be another example of curvature manifesting itself.”

**Notes for the Chapter:**

> The reader may be questioning why these characters use the names of our mathematicians such as Christoffel or Levi-Civita. The truth is they do not; they have their own names. However, while I have generally been faithful in my transcription of the events, the purpose of this manuscript is to aid readers in learning mathematics as it is taught here. Therefore I have chosen to replace the names they use with the common names in our world. I hope the reader will indulge me in this departure from strict accuracy.


	12. How Geodesics were Defined

**Summary for the Chapter:**

> The shortest distance between two points is a straight line... right?

“Spherius, my friend, it’s been too long!”

“Apologies for the absence, Arthur, I have had to deal with some practical matters. But I have been rereading our notes from before, and I think I’m ready to jump back into our research.”

“That’s good to hear, as I’ve been fairly itching to explore some more as well. My teaching schedule is rather light this term.”

“In that case, I have something deceptively simple to discuss: lines.”

“Lines?” asked Arthur. “Do you mean one-dimensional manifolds?”

“That is far too general. Walk to that stump on your right, in the shortest distance possible.” Arthur frowned, but did as Spherius suggested. “No, no, I said the shortest distance possible!”

“I did, Spherius- I walked a straight line.”

“Ah, but you did not!”

“You’re getting at something, but I’m having trouble seeing it.”

“Remember, Arthur, that you live in a manifold embedded in the third dimension. The path you took had an upward slope, giving it curvature. A straight line would pass through the third dimension, and would be a considerably shorter distance.” 

“I see. Like if a bug were running on a curve in my world, it could get to the end more quickly by leaving the curve. But, by the same token, there may be a yet shorter path than yours if you could cut through the fourth dimension.”

“Exactly! Imagine what we could do if we could take shortcuts through higher dimensions! Why, it may be that a planet lightyears away is only a few miles if we could pass through those worlds!”

“And you know how to do that?”

“Well… no,” Spherius admitted. “Our scientists have yet to fully unlock the higher dimensions. But if we learn how you are affected by moving through the third dimension, perhaps that theory can be applied to even greater ends.”

“All right, let’s do that. I suppose the first thing to do is describe what a line is. To me, at least, I walked a straight line to get where I was going in a minimum distance. So I suppose what we’re describing is a path between two points that minimizes the distance _without leaving the manifold_.”

“That’s right. This is a common problem when navigating the surface of a sphere, where finding the best path is called geodesy, so I propose these generalized lines be called _geodesics_.”

“I don’t know what a sphere is, but I will accept this terminology. Of course, on a one-dimensional manifold it is pretty easy. If the King of Lineland had wanted to visit his subjects far away, there was only one path available for him.”

“Again, though, a line is not the only option. What about a circle? That is a type of one-manifold, but there are two paths to any other point. Does the King go left or right? One is shorter, yet they both look like straight lines. And on a sphere, you can continually move north and return to where you started.”

“I have more options than the King of Lineland, and you have even more in Spaceland. So what is a geodesic?” Arthur puzzled for a minute, thinking of the bug on the curve. “The definition of a line in Euclidean Geometry is that it has a constant slope, which in calculus is equivalent to having a constant derivative. So on a geodesic curve γ, we should have that dγ/dt is constant.”

“Another way of saying that,” said Spherius, “is that the covariant derivative of the velocity dγ/dt along γ is zero. After all, the geodesic is only the shortest path if there is no force pushing you out of your manifold.”

“That sounds right. So let’s define a geodesic as any curve γ:I->M where D/dt(dγ/dt) = 0 for all t in I. And as an analogy with line segments, we’ll say the image of a closed interval [a,b] under a geodesic curve is a geodesic segment.”

“You mentioned dγ/dt= c, where c is some constant, but that’s not necessarily true, is it? I mean, from your perspective you’re traveling along a straight line, but from my perspective there are twists and turns that cause the vector to change over time.”

“You’re right… let me think about this one… I think the issue is that speed is constant, not velocity. The variable t… well, it’s easy to think of it as time, but it’s really a parametrization of an interval, and I suppose you can think of the interval as an interval of time if you travel at a constant speed… this is all getting a bit circular. So let’s step back and reiterate the definition of a geodesic: It is any parametrized curve in a manifold where D/dt(dγ/dt) = 0. So from my perspective it’s a particle that travels in a straight line at a constant speed, but what we’re really saying is that there’s no meaningful outside force acting on it.”

“That feels correct,” said Spherius. “Not only outside forces, though- you cannot exert a force to leave Flatland from within, either. Now since your speed is constant from within, that means that if you were to enter the tangent plane at some point, it would always be at that speed, so |dγ/dt| = c, where c is any positive value. The tangent vector changes, but its length is always the same.”

“There we go, I feel like we’re really on the right track now. Though technically c could be 0, but we can probably ignore that case since it would just be a point. Let’s check that it fits our definition formally. $|d\gamma/dt|$ is $\sqrt{\langle d\gamma/dt, d\gamma/dt\rangle}$, so $\frac{d}{dt}|d\gamma/dt|^2 = 2\langle \frac{D}{dt}\frac{d\gamma}{dt}, \frac{d\gamma}{dt}\rangle = 0$. Then we’re right, $|d\gamma/dt|^2$ is constant, so $|d\gamma/dt|$ is as well.”

“One immediate use for this is that it allows us to measure arc length. The arc length formula in calculus is $\int_a^b |d\gamma/dt|\, dt$, so on a geodesic the arc length would be c(b-a).”
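The two facts just used can be laid out in one line each. This sketch assumes, as the dialogue does, that the connection is compatible with the metric; the symbols c, a, and b are the ones from the conversation.

```latex
\frac{d}{dt}\left\langle \frac{d\gamma}{dt}, \frac{d\gamma}{dt} \right\rangle
  = 2\left\langle \frac{D}{dt}\frac{d\gamma}{dt}, \frac{d\gamma}{dt} \right\rangle = 0
\quad\Longrightarrow\quad
\left|\frac{d\gamma}{dt}\right| = c,
\qquad
\int_a^b \left|\frac{d\gamma}{dt}\right| dt = \int_a^b c \, dt = c\,(b-a).
```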

“And we don’t have to parametrize by time, then. We could parametrize by distance. I could just tick off the number of feet s I walk along the path, and then the speed is |dγ/ds| = 1. So then the arc length is just b-a, the ending point minus the starting point. Spherius, you like using local coordinates, would you like to take a look at them?”

“You read my mind. In local coordinates, a path is expressed as $\gamma(t) = (x_1(t), x_2(t), \ldots, x_n(t))$, so $d\gamma/dt = (dx_1/dt, dx_2/dt, \ldots, dx_n/dt)$, and we want the covariant derivative of that velocity to be 0. Then we apply the expression we derived before with the Christoffel Symbols:

$\frac{D}{dt}\frac{d\gamma}{dt} = \sum_k \left\{ \frac{d}{dt}\frac{dx_k}{dt} + \sum_{i,j} \Gamma_{ij}^k \frac{dx_i}{dt}\frac{dx_j}{dt} \right\} \frac{\partial}{\partial x^k} = \sum_k \left\{ \frac{d^2 x_k}{dt^2} + \sum_{i,j} \Gamma_{ij}^k \frac{dx_i}{dt}\frac{dx_j}{dt} \right\} \frac{\partial}{\partial x^k}$. Note the use of the superscript on the $\partial/\partial x^k$ to indicate it is a basis vector, as opposed to $dx_j/dt$, which is a derivative of a real-valued function.”

“So you’ve reduced the problem to a system of differential equations of the form $\frac{d^2 x_k}{dt^2} + \sum_{i,j} \Gamma_{ij}^k \frac{dx_i}{dt}\frac{dx_j}{dt} = 0$. I’ll admit, differential equations are not my forte- I tend to leave them to more scientifically minded folks like yourself. But at least we now have a well-defined problem. Since we have a mix of derivatives in the tangent space evaluated at different points, I think it would be wise to reexamine the tangent bundle.”

“Remind me what that is?” asked Spherius.

“Simply put, it’s the pairing of points on the manifold with potential velocity vectors. We know that at any point on a manifold, an entire tangent space of velocity vectors is available for a particle to set off along. So TM is a larger space of pairs (q, v), where q is in M and v is in TqM.”

“Oh, I see where you’re going with this. Each vector in the tangent space is a linear combination of basis vectors: $\sum_{i=1}^n y_i \, \partial/\partial x^i$, where the y’s are all real-valued functions of t. That means that we can start with our parametrization γ(t) that gives a curve in M, and append the y’s to it to get a function that gives both position and velocity at any time: $(x_1, \ldots, x_n, y_1, \ldots, y_n)$.”

“But TM has its own differential structure,” said Arthur. “What that means is that the vector you constructed is in fact a parametrized curve in TM. Furthermore, each of those y’s is the derivative of the corresponding x: $dx_k/dt = y_k$.”

“So the derivatives of the first n coordinates are y’s. What about the derivatives of the y’s?”

“We’ll have to use our earlier equation, the one with the Christoffel symbols. Each component of the covariant derivative is written $\frac{d^2 x_k}{dt^2} + \sum_{i,j}\Gamma_{ij}^k \frac{dx_j}{dt}\frac{dx_i}{dt}$, or substituting $y_k = dx_k/dt$, we have $\frac{dy_k}{dt} + \sum_{i,j}\Gamma_{ij}^k y_i y_j$. And since the covariant derivative is zero on a geodesic, we get $\frac{dy_k}{dt} = -\sum_{i,j}\Gamma_{ij}^k y_i y_j$. I guess that’s simpler, but does it help?”

“Immeasurably!” shouted Spherius, causing Arthur to jump. “The y’s are all real-valued functions, meaning the equations $dx_k/dt = y_k$ and $dy_k/dt = -\sum_{i,j}\Gamma_{ij}^k y_i y_j$ are first-order differential equations! This is so much simpler than the second order system we started with.”
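A first-order system like this can be handed to any standard numerical integrator. As an illustration only, the sketch below assumes the unit sphere with coordinates (theta, phi), an example that is not in the dialogue, for which the nonzero Christoffel symbols are Γ^θ_φφ = −sin θ cos θ and Γ^φ_θφ = Γ^φ_φθ = cot θ. The function names and the choice of a classical Runge-Kutta step are assumptions of this note.

```python
import math


def geodesic_rhs(state):
    """Right-hand side of dx_k/dt = y_k, dy_k/dt = -sum_ij Gamma_ij^k y_i y_j
    on the unit sphere, with state = (theta, phi, y_theta, y_phi)."""
    theta, _phi, y_theta, y_phi = state
    d_y_theta = math.sin(theta) * math.cos(theta) * y_phi ** 2
    d_y_phi = -2.0 * (math.cos(theta) / math.sin(theta)) * y_theta * y_phi
    return (y_theta, y_phi, d_y_theta, d_y_phi)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step of size dt."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, dt / 2))
    k3 = geodesic_rhs(shifted(state, k2, dt / 2))
    k4 = geodesic_rhs(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, t_end, steps):
    """Follow the geodesic from the given initial point and velocity up to t_end."""
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state


def speed(state):
    """Length of the velocity vector measured with the sphere's metric."""
    theta, _phi, y_theta, y_phi = state
    return math.sqrt(y_theta ** 2 + math.sin(theta) ** 2 * y_phi ** 2)


if __name__ == "__main__":
    start = (math.pi / 3, 0.0, 0.2, 1.0)
    end = integrate(start, 1.0, 1000)
    print("speed at start:", speed(start))
    print("speed at end:  ", speed(end))
```

The two printed speeds should agree closely, which matches the earlier observation that |dγ/dt| is constant along a geodesic. The coordinates break down at the poles (sin θ = 0), so the starting data here is chosen to stay away from them.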

“And that means we can locally solve for a unique trajectory that satisfies the system!” exclaimed Arthur. “So if we know a few parameters, there’s only one possible geodesic!”

“Right, it’s a basic conclusion from differential equations.” Spherius was talking faster than usual, as though caught up in the excitement. “There should then be a vector field on TM that gives geodesic flows.”

“Flows,” repeated Arthur. “A flow involved picking a starting point and watching how it was moved by a vector field over time. So we could define it as a function ϕ(t,q): if we fix a time t, then ϕt(q)=ϕ(t,q) moves every point along for that long, and if we instead fix a starting point q and let t vary, q is mapped to an entire curve.”

“Yes, and then we should be able to find a unique vector field G on TM whose trajectories are the curves t -> (γ(t), γ’(t)) with γ a geodesic, that being the vector field that corresponds to the above system of equations. So we’ll call G the _geodesic field_.”

“Then this geodesic field will send everything along a ‘straight line’ path,” said Arthur. “We could picture one of the straighter parts of the river, which pushes everything along in a straight line at constant speed. I would see myself as moving in a straight line. You might see me going up and down, but there would be no acceleration detectable from within Flatland. So every point in the river would be pushed along a geodesic path of some sort, since there would be no force accelerating or decelerating me through Flatland.”


	13. How the Exponential Map was Derived

**Summary for the Chapter:**

> Arthur and Spherius work more with geodesics and define a new map.

“I think we need to look deeper into curves that come from points,” announced Arthur. “We’ve talked a lot about using the flow to define an entire curve, and functions taking points to curves may be a richer vein than we realize.”

“I agree,” said Spherius. “Of course, we also need to pick an initial point and a velocity. Time, starting point, and velocity are three parameters, and then there should be a unique geodesic that is generated by those parameters.”

“Right, that was the conclusion of all our work with differential equations earlier.”

“So pick a starting point in M and a velocity in the tangent space-”

“I have a better idea,” said Arthur. “What if we simply pick a point in the tangent bundle? Say (p,0) in TM. Then that ‘point’ contains information about both position and speed.”

“But there is no speed now. The second coordinate is zero, so it’s stationary.”

“Ah, but here’s the beauty of it. Since TM is a manifold, we can pick an open set around it. Let’s call that set U, and we’ll say it includes all points near p with small velocities. That is to say, there’s some open neighborhood V in M that contains p, and there’s some ε1 > 0 so that any point (q,v) in U is a pairing of a point q in V and a velocity with magnitude less than ε1. That way U contains all possible combinations of points and velocities, which are enough to start a curve.”

“And then a particle moves along the curve,” mused Spherius. “If we pick a point and follow it over time, but for how long… over an interval!” he cried. “We have seen that the geodesic is a local solution of the equation, so it must be valid on a neighborhood of time, position, and velocity. But a local neighborhood of time is just an interval, so if we call our initial time zero, then the solution is valid on times between -δ and δ, where δ is some positive number.”

“All told, then, the geodesics are given by the function γ: (-δ, δ) x U -> M, so the triple of time, starting position, and velocity will give you your ending point. If we leave t as a variable, then γ(t,q,v) traces out the unique geodesic that passes through q with velocity v at time 0.” (Figure 13.1)

“It occurs to me, though, that time and velocity are inversely related. Is it fair to say that we could increase the speed and decrease the time, and still find a solution to the system?”

“That makes sense, but let’s see if it matches our definitions,” said Arthur. “Our intuition from physics is that if we divide time by _a_ and multiply velocity by _a_, we should have the same point. Symbolically, γ(t, q, av) = γ(at, q, v), where the latter is defined on 1/a of the original time interval. If we want this to be a curve, though, we need to specify that it is a function of time alone. So let’s say we started with a geodesic γ(t, q, v) on the time interval (-δ, δ), and we want to consider h(t) = γ(at, q, v) on the time interval (-δ/a, δ/a). It’s easy to see that h(0)= q and h’(0) = a γ’(at, q, v) |0 = av.”

“And that is still a geodesic, correct?”

“We can check the covariant derivative. Let’s see, that means we need a connection, which we should already have. h’(t) can be extended to some neighborhood of h, and then $\frac{D}{dt}\frac{dh}{dt} = \mathrm{Del}_{h'(t)} h'(t) = \mathrm{Del}_{a\gamma'(at,q,v)}\, a\gamma'(at,q,v) = a^2\, \mathrm{Del}_{\gamma'(at,q,v)} \gamma'(at,q,v) = 0$, since we defined γ to be a geodesic and thus have zero outside force acting on it. So h is a geodesic with h(0) = q and h’(0) = av, and since there is only one geodesic with that starting point and velocity, h(t) must be γ(t, q, av). But h(t) was defined as γ(at, q, v), meaning the two equal each other.”
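Collected in one place, the scaling argument reads as follows. This is a sketch in conventional notation, writing ∇ for Del and assuming a > 0; the final implication leans on the local uniqueness of geodesics with a given starting point and velocity, which came out of the differential equations in the previous chapter.

```latex
h(t) = \gamma(at, q, v), \qquad h(0) = q, \qquad h'(0) = a v, \qquad
\frac{D}{dt}\frac{dh}{dt} = a^2\, \nabla_{\gamma'(at,q,v)}\, \gamma'(at,q,v) = 0
\;\Longrightarrow\;
\gamma(t, q, av) = \gamma(at, q, v), \quad t \in \left(-\tfrac{\delta}{a},\ \tfrac{\delta}{a}\right).
```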

“Which is exactly what I said to begin with,” said Spherius. “Leave it to a mathematician to spend five minutes showing why something that works in practice must be true in theory.”

Arthur laughed. “And leave it to a physicist to barge headlong into the fray with something that’s probably true. All right, what comes next?”

“I think the practical value of this definition is that we can make the time neighborhood whatever we like simply by altering the speed. If we wanted to observe a particle on the time window of (-2,2), for instance, then from any starting point and starting velocity we can find a unique geodesic that meets those parameters and will be valid for times between -2 and 2 seconds. All we have to do is scale the velocity correspondingly.”

“We could switch the variables, as well,” said Arthur. “Instead of having a function of time, we could have points in TM as the domain and track their position after a set amount of time. Take an open set U in TM, and then there is a function that will map any point in TM (i.e., a pair of a starting position and a velocity) to- well, not a final position certainly, but the position it will reach after exactly one second. I’ll call that the _exponential map_.”

“Why in space would you call it that?”

“I’m… not exactly sure. An intuition, I suppose. So we define exp(q,v) = γ(1, q, v), which will be the position a particle reaches after traveling along the appropriate geodesic for 1 second.”

“We could also normalize the velocity vector,” said Spherius. “Our earlier scaling lemma says we could rewrite your ‘exponential map’ as exp(q,v) = γ(|v|, q, v/|v|), meaning v/|v| is a unit vector that gives the starting direction and |v| tells us how long to travel the geodesic. This map is differentiable, obviously.”

“Why is that obvious?” asked Arthur. “We would need to find a linear approximation of the map, an n x 2n matrix that represents an approximation of the function at a given point (q,v) in TM-”

“Don’t overthink things, Arthur. Remember, differentiability is only meaningful in local coordinates. And you can see the local coordinates.”

Arthur looked around, and suddenly began imagining a grid near him. “Ah, of course. I call this starting point q and pick a direction v, then run along it for |v| seconds at unit speed, or equally for 1 second at speed |v|. In local coordinates, my path is a parametrization of an interval in **R**, with q as the image of 0. Then I’m just following a path at fixed speed, so it’s essentially a linear function.

“Still, having to parametrize by q and v is inconvenient; it might be better to fix the starting point. expq(v) = exp(q,v) could be defined on an open set of TqM, i.e., for all velocity vectors of small magnitude. Then we have a fixed point q and a time 1, so exp would be a function of initial velocity only. This is now a function from TqM to M, which are both n-dimensional manifolds, so I’m pretty sure it would be a local diffeomorphism as well.”

“Probably, but you’d never forgive yourself if we didn’t prove it. Let’s see, we want to compute the differential of expq at the origin of TqM, the zero velocity, applied to a vector v: $d(\exp_q)_0(v)$. The differential of a map pushes a tangent vector forward to another tangent vector. Going by what you said earlier, we can multiply v by a time t and plug that in to the exponential map to get a final position: $\exp_q(tv)$.”

“I think I see what you’re saying,” said Arthur. “A differential is really saying ‘does the velocity change under the action of the map?’ So v is our initial velocity, which is fixed, and we want to see if it changes over time, which means we need $\frac{d}{dt}\exp_q(tv)\big|_{t=0} = \frac{d}{dt}\gamma(1,q,tv)\big|_{t=0}$.”

“And we use homogeneity!” they both shouted simultaneously. They laughed, and Spherius concluded, “If we replace it with $\frac{d}{dt}\gamma(t,q,v)\big|_{t=0}$, then the answer is just v. So if we pick a starting velocity, then under the exponential map we end with the same velocity.”

“Which means the differential at the origin is just the identity. And since that is non-zero (or, rather, the identity matrix is non-singular), the inverse function theorem tells us that the exponential map is a local diffeomorphism near the origin.”
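Both the scaling property and the claim about the differential can be checked numerically in a case where geodesics are known in closed form. The sketch below assumes the unit sphere sitting inside three-dimensional space, an example not taken from the dialogue, where the geodesic leaving a unit vector q with velocity v (perpendicular to q) is the great circle cos(|v|t) q + sin(|v|t) v/|v|. The function names are hypothetical.

```python
import math


def geodesic(t, q, v):
    """gamma(t, q, v) on the unit sphere in R^3: the great circle through q
    with initial velocity v (q a unit vector, v perpendicular to q)."""
    speed = math.sqrt(sum(c * c for c in v))
    if speed == 0.0:
        return tuple(q)
    return tuple(
        math.cos(speed * t) * q_c + math.sin(speed * t) * v_c / speed
        for q_c, v_c in zip(q, v)
    )


def exp_map(q, v):
    """exp(q, v) = gamma(1, q, v), following the definition in the dialogue."""
    return geodesic(1.0, q, v)


def d_exp_at_origin(q, v, h=1e-6):
    """Central-difference estimate of d/dt exp_q(t v) at t = 0."""
    plus = exp_map(q, tuple(h * c for c in v))
    minus = exp_map(q, tuple(-h * c for c in v))
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))


if __name__ == "__main__":
    q = (0.0, 0.0, 1.0)
    v = (0.7, 0.2, 0.0)
    a, t = 2.5, 0.3
    # Homogeneity: gamma(t, q, a v) should match gamma(a t, q, v).
    print(geodesic(t, q, tuple(a * c for c in v)))
    print(geodesic(a * t, q, v))
    # Differential of exp_q at the origin applied to v should be close to v.
    print(d_exp_at_origin(q, v))
```

The first two printed points should coincide, and the last printed vector should be very close to v itself, in line with the differential at the origin being the identity.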


	14. How They Cast a Net and Caught Gauss's Lemma

Spherius was reading the reports about a new energy program. Some of his colleagues were building a sort of net around a spring where water came up in the center of a lake. The idea was to place a hemispherical dome over the spring, and then water passing through the net would generate electricity. The engineering of it wasn't much to his interest, but he was concerned about the idea of wasted energy. Most efficient was if the water went through the net at a perpendicular, but if the net was too large the flow went in more random directions. Not only was the energy wasted, but the net itself might be thrown out of alignment. Smaller nets were more difficult to hold in place, yet it might be worth it if they captured more of the energy. In theory, a very small net would capture 100% of the energy- wouldn't it?

“Theoretical problems,” mused Spherius. “I know whom to speak with now, don't I?” Spherius found Arthur and explained the nature of the problem.

“Interesting,” he said. “In my world, you'd be drawing a circle around a point that radiates energy outward. And if I'm understanding your problem correctly, this 'net' is a 2-manifold around a point in Spaceland. It does seem amenable to our processes. The first step will be to state the problem in mathematical terminology. A water particle that leaves the spring will travel outward at a perpendicular direction, correct?”

“Isn't that begging the question?” asked Spherius. “We want to see if they cross the net orthogonally, but we don't know it yet.”

“True, but I'm not talking about when it reaches the net. I'm talking about its instantaneous velocity at the moment it leaves the spring. Any straight line through the center of a circle must be perpendicular to the circle, so any vector exiting the center at least starts perpendicular to the circle. Our concern, then, is whether the ambient forces in the lake cause it to go off track.”

“Those ambient forces could be expressed as a vector field. And moreover, at any point you could see the vector field as acceleration acting on the preexisting vectors.”

“Right,” said Arthur. “I think what we're interested in is decomposing these acceleration vectors into parallel and perpendicular components. If all of the acceleration is parallel to the initial velocity, then that's your ideal situation. So ideally, the acceleration vector w would have no component perpendicular to the velocity vector v.”

“The idea of acceleration vectors is tricky. Up until now, we have always thought of the tangent vectors as lying in a tangent space TpM to the manifold M. But if we want acceleration, it must lie in the tangent space to the tangent space, Tv(TpM).”

“You're right, but I don't think it will be that difficult. After all, TpM is already a plane. I know that the tangent line to a line is itself, so the tangent plane to a plane must also be itself.”

“Ah, you have a point. So w is in Tv(TpM) = TpM, at least as far as practical calculations are concerned. Then v and w are vectors in the same space, and their inner product is of interest. We know that for small vectors, the inner product will be small as well, so at least on a small net our inner product is close to zero even if the vectors are not perpendicular.”

“True, though that doesn't tell us what the vectors will be at the end. We need to know how they are changed over time.”

“Wait,” said Spherius, feeling something dawn in his mind. “Wait, I think I'm about to be brilliant. Listen, we want to see what happens when water shoots out on an initial velocity vector, correct? So, we need to follow that along the most natural path imposed by the vector field, which will be a geodesic. But that means we're evaluating the exponential map. And since we're trying to find the new velocity vector at the end, we need a map that takes velocity vectors to other velocity vectors. Well, that's the differential map, so we're really talking about the differential of the exponential function!”

The words hung in the air for a few moments. “I think you have something there,” Arthur finally responded. “The inner product $\langle (d\exp_p)_v(v), (d\exp_p)_v(w)\rangle$ would represent the final state of the initial velocity vector and the acceleration vector after they've moved out for one second. Or, thanks to homogeneity, any time and distance you want. Of course, this assumes that the exponential map exists.”

“Why would it not?”

“I'd have to think on that one. But it seems to me that we defined it based on the assumption that our vector field defined geodesics everywhere, and can we be sure that that is true? If we let our particle stray too far from its source point, the ambient forces may overwhelm it to the point where there are multiple equally valid geodesics.”

“Perhaps,” said Spherius, “but I think in most normal neighborhoods of p it will be well-defined.”

“Normal neighborhoods?”

“Well, I mean that velocity vectors of small magnitude are taken to their ending points smoothly and bijectively, so we don't have to worry about going to multiple end points or traveling non-geodesic curves. I suppose what I mean is that the exponential map is a diffeomorphism about the origin. Last time we saw that it's always at least possible to find an open ball about the origin where the exponential is a diffeomorphism, so balls with small radius are normal.”

“Seems appropriate,” said Arthur. “Open balls are about the most 'normal' things I can imagine. So assuming the exponential map is well-defined, we'd like some way to evaluate the inner product of the final vectors $\langle (d\exp_p)_v(v), (d\exp_p)_v(w)\rangle$, and hopefully it's very small for a small net. I think the best way to start would be to break up w into wT \+ wN, the sum of the components parallel and perpendicular to v. It's obvious that pushing v parallel to itself will not affect the direction of the final vector, so we can focus on the perpendicular component $\langle (d\exp_p)_v(v), (d\exp_p)_v(w_N)\rangle$.”

“The component wN is perpendicular to the velocity vector,” said Spherius, “and the velocity vector is based at the origin. That means that wN must define a circle.”

“Ooh, very interesting. So we can express the acceleration of v by saying that v(0)=v, v'(0) = wN, and |v(s)| = k, some positive constant.”

“So v(s) essentially finds new velocity vectors by rotating the original vector along some arc. But we also have to consider how long we let the particle run, which is the magnitude of the vector.”

“Correct, so that means we have two parameters t and s. We don't want to let things get out of hand, so we'll insist s stay small enough for the neighborhood to be 'normal,' i.e., the exponential map still exists. Then we can say u = tv(s) with 0≤t≤1 and -ε<s< ε. Then u is the vector we get from going out a time t at an angle s. This lets us parametrize a wedge of your net.” (Figure 14.1)

“And now you're parametrizing wedges,” sighed Spherius. “Can we spell out exactly what that means?”

Arthur laughed. “Well, you're talking about a net as some sort of 2-manifold, meaning to me it would look like my entire world. If I were to go there, I could mark it with local coordinates, meaning it would be locally R^2. Then I suppose a surface would be some mapping from a connected subset A of R^2 into M. And it should probably be differentiable, meaning that if we extend A to an open subset, that mapping is differentiable.”

“That sounds fair. However, I would insist that the boundary of the surface be an at least piecewise differentiable curve.”

“The end of the world can't have too many kinks,” said Arthur. “Fair enough, though I could imagine some pieces of the boundary might be missing. Perhaps it is open at some areas and closed at others.”

“That could be. So your parametrized wedge is a function f:A->M, with A={(t,s)| 0≤t≤1, -ε<s< ε}. f determines the ending points of a particle shot out from the spring, so it must be the exponential map applied to our vector u: $f(t,s) = \exp_p(u) = \exp_p(tv(s))$. What's more, if we fix $s_0$, then $t \to f(t, s_0)$ defines a specific geodesic based on starting along a particular radial direction.”

“And that means that the partial derivatives of f correspond to the parallel and perpendicular motion. ∂f/∂t is how f moves outward from the origin, so parallel to v, while ∂f/∂s represents its motion perpendicular to v. (Figure 14.2)

And the exponential map also refers to traveling along our original direction for a time of 1 second. So we can rewrite our original inner product $\langle (d\exp_p)_v(w_N), (d\exp_p)_v(v)\rangle = \langle \partial f/\partial s, \partial f/\partial t\rangle$ evaluated at t=1 and s=0. Now we just need a way to simplify this further.”

“You know,” said Spherius, “I'm reminded of a mantra my first Calculus professor had: When it doubt, take a derivative. Let's try finding the covariant derivative of this inner product.”

“I suppose it can't hurt,” said Arthur. “Although if we're going to be able to apply the 'inner product rule,' we need to know the affine connection on this space is compatible with the vector field.”

“I feel it's usually safe to assume the connection is compatible. We're talking about geodesics, after all, so parallel vector transports should work nicely with them. Now let's apply ∂/∂t<∂f/∂s, ∂f/∂t> = <D/∂t (∂f/∂s), ∂f/∂t> + <∂f/∂s, D/∂t(∂f/∂t)>.”

“That entire last term is zero, since t -> f(t, s) is a geodesic for each fixed s, so ∂f/∂t has no acceleration within the manifold and D/∂t(∂f/∂t) = 0. And we can rewrite the first term as <D/∂s (∂f/∂t), ∂f/∂t>.”

“What makes you say that?”

“Oh, just some figuring I did earlier, turns out D/∂v(∂s/∂u) = D/∂u(∂s/∂v) as long as the connection is symmetric. I'll send it over to you later.” (See Lemma)

“Very well, I'll trust your calculations. Then <D/∂s (∂f/∂t), ∂f/∂t> = ½ ∂/∂s <∂f/∂t, ∂f/∂t>, by another application of the product rule. Oh, but then that's zero, since that inner product is just the norm of the radial vector, which won't be changed by rotation. So then ∂/∂t<∂f/∂s, ∂f/∂t> = 0, meaning the inner product is independent of time.”
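For readers keeping score at home, the chain of equalities the two have just talked through can be sketched in a single display. This is only a summary under the assumptions they state aloud (a connection that is symmetric and compatible with the metric, f(t,s) = exp_p(t v(s)), and |v(s)| constant in s); the typeset notation is a convenience added here rather than anything Arthur or Spherius wrote down.

```latex
\begin{align*}
\frac{\partial}{\partial t}\left\langle \frac{\partial f}{\partial s}, \frac{\partial f}{\partial t}\right\rangle
 &= \left\langle \frac{D}{\partial t}\frac{\partial f}{\partial s}, \frac{\partial f}{\partial t}\right\rangle
  + \left\langle \frac{\partial f}{\partial s}, \frac{D}{\partial t}\frac{\partial f}{\partial t}\right\rangle
  && \text{(compatibility with the metric)} \\
 &= \left\langle \frac{D}{\partial s}\frac{\partial f}{\partial t}, \frac{\partial f}{\partial t}\right\rangle + 0
  && \text{(symmetry; } t \mapsto f(t,s) \text{ is a geodesic)} \\
 &= \frac{1}{2}\,\frac{\partial}{\partial s}\left\langle \frac{\partial f}{\partial t}, \frac{\partial f}{\partial t}\right\rangle
  = \frac{1}{2}\,\frac{\partial}{\partial s}\,|v(s)|^{2} = 0 .
\end{align*}
```

The last equality uses the fact that the geodesic t -> exp_p(t v(s)) travels at the constant speed |v(s)|.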

“You're saying, then, that the inner product is the same all the way along the radial geodesic. Now ∂f/∂s(t,0) = ∂/∂s expp(tv(s)) |(t,0) = (dexpp)tv(twN), since v(0) = v and v'(0) = wN. But then we can take t to 0, meaning we make the initial velocity extremely small, and then the exponential map will send a point to a very close point. Then the limit of ∂f/∂s as t->0 is in fact 0.”

“And since the inner product is constant along the radial geodesic, it must simply be zero, meaning there is in fact no perpendicular motion when the particle hits the net! So all of the motion is determined by the radial velocity, or <(dexpp)v(v), (dexpp)v(wT)> = <v,w>. Whatever initial push the particle gets after leaving the spring will be carried entirely with it when it hits the net, so none of its force is wasted.”

“Exactly as you wanted,” said Arthur. “Although you could even drop the T from wT, since we know the normal component will contribute nothing to the result. So let's simply say <(dexpp)v(v), (dexpp)v(w)> = <v,w> as long as the exponential map is defined. And furthermore, if we're on a normal neighborhood of p, which you defined as a neighborhood where the exponential map is a diffeomorphism, then this result holds. If it's also an open ball, then its surface is perpendicular to all of the lines radiating out from the center p.”
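Collected in standard notation, the closing steps and the statement they have reached might be written as follows. The splitting w = w_T + w_N (with w_T parallel to v and w_N perpendicular to it) follows the dialogue; the scalar a with w_T = a v is a label introduced here purely for illustration.

```latex
\begin{align*}
\left\langle \frac{\partial f}{\partial s}, \frac{\partial f}{\partial t}\right\rangle(1,0)
 &= \lim_{t \to 0}\left\langle (d\exp_p)_{tv}(t\,w_N),\ \frac{\partial f}{\partial t}(t,0)\right\rangle = 0, \\
\big\langle (d\exp_p)_v(v),\ (d\exp_p)_v(w_T)\big\rangle
 &= a\,\big|(d\exp_p)_v(v)\big|^{2} = a\,|v|^{2} = \langle v, w_T\rangle, \qquad w_T = a\,v, \\
\big\langle (d\exp_p)_v(v),\ (d\exp_p)_v(w)\big\rangle
 &= \langle v, w_T\rangle + 0 = \langle v, w\rangle .
\end{align*}
```

The middle line relies on (dexp_p)_v(v) being the velocity at time 1 of the geodesic t -> exp_p(tv), whose speed is |v| throughout.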

“That result looks familiar,” said Spherius, flipping back through some old notes. “Of course! We said if <u,v>p = <dfp(u), dfp(v)>f(p), then we call f an _isometry_. So we could restate this result as saying that the exponential map acts like an isometry along the radial direction. That brings everything together.”

“Indeed it does,” said Arthur as his stomach began to rumble. “Oh, and I think it's time for me to go grab some lunch. I didn't expect we'd be so long at this. Until next time, my friend.”


	15. On the Discovery that Geodesics are Locally Minimizing

“It's time to talk some more about neighborhoods,” announced Spherius. “Now that we've defined normal neighborhoods as any neighborhood where the exponential map is diffeomorphic on some set of vectors of small magnitude, we should see what other properties they possess. Hopefully, Gauss's Lemma (which we recently proved) will help us.”

“I think it will,” said Arthur. “There's something that's been eating at me for a while. We started studying geodesics as a generalization of lines, meaning the shortest distance between two points. A geodesic is any path that looks straight to me from inside the manifold, though it may appear curved to you. But we couldn't quite prove it, and we had to fall back on a definition about the covariant derivative vanishing. Maybe now we can actually prove that geodesics are minimizing.”

“It's worth a try. You are proposing that a geodesic connecting two points will have a shorter length than any other path?”

“I think we need to be a little more restrictive. 'Paths' are merely continuous, and some continuous curves have no clearly defined length. Coastlines, for example, since they are fractal by nature. Our definition of arc length is dependent on the derivative, so we should require other paths to be at least piecewise differentiable.”

“That's fair,” agreed Spherius. “So we have p as a point in M with a normal neighborhood U. In fact, to be even simpler, let's say B is a normal ball inside U which is centered at p.”

“That should be fine, since U is open and any subset of it will still be normal. Then we'll take a geodesic segment γ:[0,1] -> B with γ(0) = p, and c:[0,1] -> M is any piecewise differentiable curve joining γ(0) to γ(1).”

“You're allowing c to leave the neighborhood of p?”

“Why not?” asked Arthur. “After all, if the geodesic is the optimal path within the neighborhood, it's hard to imagine it could be beaten by leaving said neighborhood.”

“I suppose, though I think we should start by assuming c does in fact lie in B. Then we have l(γ) ≤ l(c), and they can only be equal if they are the same curve.”

“Okay, we've got ourselves a proposition, let's get to it. You stipulated that c([0,1]) is contained in B. Since we have the exponential map at p is a diffeomorphism on U, that means every point in U can be expressed as the image of some velocity vector based at p. Then we could see the exponential map as tracing out the image of c(t) by evaluating at specific vectors based on t.”

“I agree, but I'd like to separate the vector into a unit vector and a scalar. So we can say we're evaluating expp(r(t)v(t)), where r is real-valued and v is vector valued with |v(t)|=1 for all t.” (Figure 15.1)

“Technically, what you're saying is that v(t) is a curve in TpM- we could picture it as a point moving along a unit circle and returning the vector connecting the center to that point. Then r gives us a speed to travel along the geodesic leaving p in the direction of v. We could define f(r(t),t) = expp(r(t)v(t)).”

“So c(t) is precisely f(r(t),t), then,” said Spherius. “Then we can differentiate it and get dc/dt = ∂f/∂r(r'(t)) + ∂f/∂t. To calculate length, we'll then need to take the inner product of dc/dt with itself.”

“Just some good old-fashioned number crunching, then. |dc/dt|^2 = <dc/dt,dc/dt> = <∂f/∂r(r'(t)) + ∂f/∂t, ∂f/∂r(r'(t)) + ∂f/∂t>, which by linearity is <∂f/∂r(r'(t)), ∂f/∂r(r'(t))> + <∂f/∂r(r'(t)), ∂f/∂t> + <∂f/∂t, ∂f/∂r(r'(t))> + <∂f/∂t, ∂f/∂t>.”

“But I think Gauss's Lemma can eliminate a lot of that,” said Spherius. “We've essentially parametrized f in terms of r and t in the same sense as we parametrized the wedge as s and t in our earlier proof. So <∂f/∂r, ∂f/∂t> = 0, which knocks out the middle two terms.”

“You're right. And what's more, r defines speed, meaning doubling the speed will double the distance we travel along the curve. So f(r) is linear with constant slope 1, |∂f/∂r| = 1. Together, that boils the whole thing down to <(r'(t)), (r'(t))> + <∂f/∂t, ∂f/∂t> = |r'(t)|^2 + |∂f/∂t|^2.”

“That does it, then! We have |dc/dt|^2 = |r'(t)|^2 + |∂f/∂t|^2, so |dc/dt|^2 ≥ |r'(t)|^2. (1) Then we take the square roots and integrate, and that gives us that the length of c is at least the total change in r.”

“We need to be a little more careful than that,” said Arthur. “We can't really define r(t) when t equals 0, since we could imagine the particle moving at a million miles per hour for no time if we wanted. So let's say ε is a small positive number. Then ∫_ε^1 |dc/dt| dt ≥ ∫_ε^1 |r'(t)| dt ≥ ∫_ε^1 r'(t) dt = r(1) - r(ε). But then if we take the limit as ε goes to zero, we get precisely that ∫_0^1 |dc/dt| dt ≥ r(1) - r(0) = r(1).”

“You mathematicians and your limits. Regardless, this gets us where we want, since the left-hand side is the length of c and r(1) = |r(1)v(1) - r(0)v(0)| = l(γ).”
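Put side by side, the speed estimate and the limiting argument might be sketched like this. It assumes, as the conversation does, that c(t) = f(r(t), t) = exp_p(r(t) v(t)) with |v(t)| = 1, and additionally that c(t) differs from p for t > 0, so that r(ε) tends to 0 with ε; for a curve that is only piecewise differentiable the integrals are understood piece by piece.

```latex
\begin{align*}
\left|\frac{dc}{dt}\right|^{2}
 &= |r'(t)|^{2}\left|\frac{\partial f}{\partial r}\right|^{2}
  + 2\,r'(t)\left\langle \frac{\partial f}{\partial r}, \frac{\partial f}{\partial t}\right\rangle
  + \left|\frac{\partial f}{\partial t}\right|^{2}
  = |r'(t)|^{2} + \left|\frac{\partial f}{\partial t}\right|^{2}
  \ \ge\ |r'(t)|^{2}, \\
\ell(c) &= \lim_{\varepsilon \to 0^{+}} \int_{\varepsilon}^{1} \left|\frac{dc}{dt}\right| dt
 \ \ge\ \lim_{\varepsilon \to 0^{+}} \int_{\varepsilon}^{1} r'(t)\, dt
 = r(1) - \lim_{\varepsilon \to 0^{+}} r(\varepsilon) = r(1) = \ell(\gamma).
\end{align*}
```

The cross term vanishes by Gauss's Lemma, and the first term simplifies because |∂f/∂r| = 1.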

“Now that's progress,” said Arthur. “Let's clean up the edge cases we skipped over. We said earlier that the lengths should be equal if and only if the curves are the same. We can see easily enough that if (1) is strict, then the integral inequality is also strict, and l(c) > l(γ). And if it's not strict, then |∂f/∂t| = 0, so the direction of the velocity is constant. So c is really just a reparametrization of γ and their lengths will turn out to be the same.”

“Lastly, we need to show that there's no shortcut available by leaving the normal neighborhood. Say c([0,1]) is not contained in B. Now c still starts at the center of B, so let's say t1 is the first time at which c reaches the boundary of B. Then l(c) ≥ l[0,t1](c) ≥ ρ > l(γ), where ρ is the radius of the ball. In other words, a path that leaves the ball must first go to the edge of the ball, and that alone is longer than the geodesic γ, which stays in the interior of the ball.”
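As a compact restatement of this last case, one might write the estimate as below. The notation B_ρ(p) for the ball B and the use of a minimum to pick out the first boundary time are choices made here for illustration; the minimum exists because c is continuous, starts inside the ball, and leaves it.

```latex
\[
\ell(c) \;\ge\; \ell\big(c|_{[0,t_1]}\big) \;\ge\; \rho \;>\; \ell(\gamma),
\qquad
t_1 = \min\{\, t \in [0,1] : c(t) \in \partial B_\rho(p) \,\}.
\]
```

The middle inequality is the earlier argument applied to the portion of c that stays inside the ball, with r(t) approaching ρ as t approaches t1.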

“That about covers it,” said Arthur. “But the most important thing to keep in mind is that it's a local result. We just showed that a geodesic between two points is automatically the shortest path between those points as long as they're in the same normal neighborhood. But if they are far apart, it's easy to see that there could be multiple geodesics connecting the points that aren't all minimizing. Going in the positive and negative directions on the unit circle, for example.” (Figure 15.2)

“You're right. Or circumnavigating the globe- the equator is a geodesic going east or west, but generally only one of those is minimizing.”
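Spherius's equator example can be made concrete with a small worked case. Assume a sphere of radius 1 and two points on the equator whose longitudes differ by an angle θ strictly between 0 and π (the symbol θ and the labels "east" and "west" are introduced here for illustration):

```latex
\[
\ell(\text{east}) = \theta, \qquad \ell(\text{west}) = 2\pi - \theta, \qquad
0 < \theta < \pi \;\Longrightarrow\; \ell(\text{east}) < \ell(\text{west}).
\]
```

For θ = π/2 the two arcs have lengths π/2 and 3π/2: both are geodesics, but only the first is minimizing. When θ = π the points are antipodal and the two arcs tie.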


	16. How They Thoroughly Minimized Geodesics

“I have a bit of a topological concern,” mused Arthur.

“I'd suggest seeing a doctor.”

“Very funny. No, I was thinking about normal neighborhoods and wondering if we could do better. You see, in topology we have open balls, which are centered at a point, but more generally open neighborhoods are sets where every point has an open ball around it that stays in the set. The open balls become a basis. I was wondering if maybe there's something better than a normal neighborhood.”

“Before we get too deep, remind me of our definition of a normal neighborhood?”

“We described it as a neighborhood that is the image of applying the exponential map to an open ball in the tangent space, and that mapping is a diffeomorphism. That is to say, it is the result of applying the exponential map smoothly to all vectors shorter than a certain length. So we can cover an entire normal neighborhood from its center, but I was thinking we could start from any point in the neighborhood and still cover it.”

“If you think this is a profitable line of questioning, then we can take a look,” said Spherius. He sketched a couple drawings to illustrate the difference between the situations. (Figures 16.1 and 16.2)

“We should start by explicitly stating the problem. You're saying that we want to start with a point p in M, and then find a neighborhood W so that every q in W can be the base of a smooth exponential map that covers all of W.”

“Right, so expq is a diffeomorphism from Bδ(0), a subset of TqM, to some open set that contains all of W. Maybe W itself, maybe even more.”

“What is δ?” asked Spherius. “Is it the same number at every q, or does it depend on q?”

“I guess we'll find out. It's always nice to be uniform, but we'll see if that works. We started last time with a small number ε and A as a subset of the tangent bundle, so it included points close to p (in some neighborhood V about p) paired with vectors with speed less than ε. Then we'll define a function F that gives us both our starting and ending points, so F: A -> M x M by F(q,v) = (q, expqv).”

“All right, that's a start. Now A is part of some local tangent bundle, so there's some system of local coordinates U where A is in TU and V is in **x**(U).”

“Right, I see what you mean,” said Arthur. “In Flatland, I could mark points with local coordinates, and A would be those points paired with velocity vectors.”

“Yes, but next we can deal with the case where there is no velocity at all.”

“You mean a stationary point?”

“I mean what happens if the point is not pushed. Then we have F(p,0) = (p,p), a point in M x M. Well, then, M x M is a new smooth manifold with a new system of coordinates (U x U; **x, x**). But then what is the differential of F at (p,0)?”

“I wasn't expecting this,” said Arthur, as he contemplated the new problem. “Let's see... The differential is a linear approximation for values close to (p,0). Well, for small magnitude vectors, the exponential map is approximately the identity, since traveling along the tangent vector will be approximately the same as traveling a geodesic for a short distance. So if we input (q,v), where q is near p and v is a small vector, we should get approximately (q, q+v): the starting point is unchanged, and the ending point is the starting point displaced by v. That means the matrix will be ({I, 0}, {I, I}), a block matrix with 0 in the top right, signifying that the starting point is based only on initial position, while the ending point is based on both initial position and velocity.”

“I think you have it,” agreed Spherius. “And that matrix is undoubtedly invertible, so the inverse function theorem says F is a local diffeomorphism near (p,0). So we can find A' in A that F maps diffeomorphically to W', a neighborhood of (p,p) in M x M.”
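In symbols, the linearization might be sketched as follows. Here δq and δv are names introduced for small changes in the base point and the vector, and the display assumes the standard fact that the differential of exp_q at the zero vector is the identity. It is written with the two outputs (starting point, ending point) as rows; arranging the same blocks in the transposed pattern ({I, I}, {0, I}) gives the same determinant, so the invertibility does not depend on that convention.

```latex
\[
F(q,v) = (q, \exp_q v), \qquad
dF_{(p,0)}\begin{pmatrix}\delta q\\ \delta v\end{pmatrix}
= \begin{pmatrix} I & 0\\ I & I \end{pmatrix}
\begin{pmatrix}\delta q\\ \delta v\end{pmatrix}
= \begin{pmatrix}\delta q\\ \delta q + \delta v\end{pmatrix},
\qquad
\det \begin{pmatrix} I & 0\\ I & I \end{pmatrix} = 1 .
\]
```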

“Yes, and we can shrink A' so it includes only points near p with small velocities, that is, A' = {(q,v) | q ∈ V' and v ∈ TqM, with |v| < δ}, where V' is some smaller neighborhood of p contained in V. And we can pick W in M so that W x W is contained in W'. And that should do it!”

“What do you mean?” asked Spherius.

“Well, F now takes A' to W' diffeomorphically. So pick any q in W and any second point q' in W. The pair (q, q') lies in W x W, which is contained in W', so it is the image under F of some pair in A'. But F takes (q, v) to the pair of points (q, expqv), so that pair must be (q, v) with v in Bδ(0) contained in TqM and expqv = q'. Then {q} x W is contained in F({q} x Bδ(0)), so W is contained in expq(Bδ(0)).”
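The containment argument is easy to lose track of in prose, so here is one way it might be laid out step by step. The symbol q' for the second point of W is a label introduced here; everything else follows the sets A', W', and W chosen in the conversation.

```latex
\begin{align*}
q,\ q' \in W
 &\;\Longrightarrow\; (q, q') \in W \times W \subset W' = F(A') \\
 &\;\Longrightarrow\; (q, q') = F(q, v) = (q, \exp_q v) \ \text{ for some } v \in T_qM,\ |v| < \delta \\
 &\;\Longrightarrow\; q' \in \exp_q\big(B_\delta(0)\big),
 \qquad \text{hence } W \subset \exp_q\big(B_\delta(0)\big) \text{ for every } q \in W .
\end{align*}
```

The base point of the preimage has to be q itself because F leaves the first coordinate unchanged.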

“All right, that was quite technical, but let's go over the implications. We're now saying that if we start from any point in W, we can reach a second point by traveling along a geodesic. And combined with what we saw earlier, this can be summed up with the old Euclidean axiom that the shortest distance between two points is a straight line- or, rather, a geodesic.”

“Yes, but only locally,” said Arthur. “We have already seen that geodesics may not be unique over large neighborhoods. But our proof shows we can always find some neighborhood on which that holds. In fact, on that neighborhood, any two points are joined by a unique geodesic γ of length < δ, and γ depends differentiably on the starting and ending points.”

“Why would the length of the geodesic be small? Oh, wait, because the speed of a geodesic is constant, so if we want to find its length, we integrate that constant |v| < δ from 0 to 1 and get the same value.”

“Right, but of course δ could be very large in some cases. In Euclidean space, it would be infinite since you can always follow a straight line between any two points no matter how distant. Then Euclidean space is better than normal, it's totally normal, and we'll call any such W, a neighborhood that is normal about each of its points, a _totally normal neighborhood_.”

“What was that you said about differentiable dependence?”

“Well, say we have two points (q1, q2), then we can use F^-1 to extract the vector v that defines the geodesic joining them. Since F^-1 is a diffeomorphism, γ itself varies smoothly with changes in q1 and q2.”
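Written out, the recipe Arthur describes might look like the following sketch, which assumes both points lie in W so that the pair is in the image of F:

```latex
\[
(q_1, v) = F^{-1}(q_1, q_2), \qquad
\gamma(t) = \exp_{q_1}(t\,v), \quad t \in [0,1], \qquad
\ell(\gamma) = \int_0^1 |\gamma'(t)|\,dt = |v| < \delta .
\]
```

Since v depends smoothly on the pair of points through F^-1, so does the whole curve γ.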

“Ah, yes. And this additionally shows that all minimizing curves (among piecewise differentiable curves) are geodesics. We simply find an appropriate totally normal neighborhood of a curve and apply the appropriate theorems from earlier.”


	17. How They Stumbled Across the Curvature

“There’s a storm coming,” announced Arthur.

“I’m sorry?” asked Spherius.

“A storm. Big wind blowing from the west.”

“I see. That could be interesting.”

“Interesting how?” asked Arthur suspiciously.

“Oh, don’t misunderstand, Arthur, I don’t want you to endanger yourself. But if it’s safe, it might be worth investigating. A neighborhood of Flatland will have both the eastward push of the wind and the southern pull of gravity. It may be informative to see how they interact with one another, or even with a third force.”

“I suppose,” said Arthur, as he felt the raindrops begin to hit him. “We could take one of your motors and see how it is affected by the wind and gravity.”

“That’s a good start. We’ll call gravity X and the storm Y, and then the motor can be Z. Then what we have is a function that takes the vector field Z to a new vector field.”

“And the motor can be moved freely so that its positions are the domain of the function. Then we see X and Y will affect it, and the way they affect it will be determined by the Riemannian Connection.”

“Oh, right, the connection,” said Spherius. “So we have DelXZ for the effect of gravity but then that motion is changed by the storm, meaning DelYDelX(Z). And then we should subtract out the other direction, i.e., DelXDelY(Z) to get the net effect.”

“And that should do it,” said Arthur grandly. “The effect of gravity and the wind on your motor’s action is DelYDelX(Z) - DelXDelY(Z). But since Flatland is smooth, the partial derivatives can be interchanged, so the net effect is zero.”

“Not quite, Arthur. We have to account for the interplay of the wind and gravity, which should require the bracket.”

“I’m sorry, what? The bracket?”

“Remember, we said the instantaneous flow could be represented by the bracket XY-YX?”

“Yes, yes, I remember what the bracket is, but I don’t see why it’s relevant when there is no instantaneous relationship between the two forces.”

“How can you say that? Gravity affects the wind and the wind affects gravity in turn.”

“Of course it does, but we’ve already accounted for that,” said Arthur. “The bracket is about limiting properties of the interactions, but the interaction is constant.”

“It’s not constant, though!” shouted Spherius. “Southward motion is also eastward motion, isn’t that obvious?”

“I think the pressure is starting to get to you, Spherius. East and south are orthogonal! Gravity is (0, -1) and wind is (1, 0), give or take some scalar multiples.”

“They’re not, though! East is (1, 1/10, -2) and south is (-1/5, -1, 1).”

“Are you saying cardinal directions are not orthogonal?”

“Of course not, because they’re only cardinal from your perspective!” The words struck both men silent as they thought about the implications.
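The disagreement comes down to a line of arithmetic. Taking the ordinary dot product in each man's own coordinates (an assumption made here, since neither of them names the inner product he has in mind), Arthur's pair of directions is orthogonal and Spherius's pair is not:

```latex
\[
\langle (1, 0),\ (0, -1) \rangle = 0,
\qquad
\left\langle \left(1, \tfrac{1}{10}, -2\right),\ \left(-\tfrac{1}{5}, -1, 1\right) \right\rangle
= -\tfrac{1}{5} - \tfrac{1}{10} - 2 = -\tfrac{23}{10} \neq 0 .
\]
```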

“Flatland appears flat,” said Arthur slowly. “To me it does, anyway. So the cardinal velocities ∂/∂x and ∂/∂y are orthogonal. Then [∂/∂x, ∂/∂y] = ∂/∂x ∂/∂y - ∂/∂y ∂/∂x = 0, since the manifold’s smoothness means we can interchange partial derivatives.”

“And this would be true in higher dimensions as well,” said Spherius. “[∂/∂xi, ∂/∂xj] = 0 for any pair of i and j when we’re living in Euclidean Space. So the bracket’s effect will be non-existent.”

“What’s more, the total expression will also be zero. DelYDelXZ = DelY(Xz1,…, Xzn), where we think of the z’s as real-valued functions and X as an operator that turns them into new functions. So DelY(Xz1,…, Xzn) = (YXz1,…, YXzn), and DelXDelYZ = (XYz1,…, XYzn), which are the same if X and Y commute as above. Even when they don’t, the difference is exactly -([X,Y]z1,…, [X,Y]zn) = -Del[X,Y]Z, so we end up with DelYDelX(Z) - DelXDelY(Z) + Del[X,Y]Z = 0.”

“But this is only true in Rn, meaning in a curved space the expression may be non-zero. So we can use this to evaluate the curvature of a manifold. Let’s call it R, and we’ll say R(X,Y)Z = DelYDelX(Z) - DelXDelY(Z) + Del[X,Y]Z is the _curvature (R) of a manifold M_.”
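For reference, the definition and the flat-space check might be typeset as below. The sketch writes Del as the usual nabla symbol and assumes Z = (z_1, ..., z_n) in the standard coordinates of R^n, where the covariant derivative acts component by component.

```latex
\begin{align*}
R(X,Y)Z &= \nabla_Y \nabla_X Z - \nabla_X \nabla_Y Z + \nabla_{[X,Y]} Z, \\
\text{in } \mathbb{R}^n:\quad
R(X,Y)Z &= \big( (YX - XY + [X,Y])\,z_1,\ \dots,\ (YX - XY + [X,Y])\,z_n \big) = 0,
\qquad [X,Y] = XY - YX .
\end{align*}
```

The sign convention is the one used in the dialogue; some texts define R with the opposite sign.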

“Another way of seeing that,” said Arthur, “is that if we have a coordinate system {xi} around p, we can apply the definition to them. [∂/∂xi, ∂/∂xj] = 0, as we saw. Then we can take any two of these directions and create a curvature operator to apply to a third one, so R(∂/∂xi, ∂/∂xj)∂/∂xk = Del∂/∂xj Del∂/∂xi(∂/∂xk) - Del∂/∂xi Del∂/∂xj(∂/∂xk).”

“And the magnitude of that expression depends on the curvature,” mused Spherius. “In R^n, the connection between cardinal directions is zero because they are all orthogonal. The north and east directions are perpendicular to you, but as a manifold they are not. So the more curved the space is, the less commutative the covariant derivative is.”

“This is a good start,” said Arthur. “I think we need to explore this in a formal manner. What are the purely mathematical properties of this operator?”

“Sounds like it’s time for some grunt work,” sighed Spherius. “Pages upon pages of tinkering with mathematical identities.”

“I know, isn’t it great?”

While Arthur may have enjoyed the explorations, we will spare the reader a detailed description of the tinkering. However, we will summarize the results. The first is that curvature is linear in two different ways. It is bilinear over X(M) x X(M), meaning R(fX1 + gX2, Y1) = fR(X1, Y1) + g R(X2, Y1) and R(X1, fY1 + gY2) = fR(X1, Y1) + g R(X1, Y2). Note that the expressions in these equations are _operators_, meaning that evaluating them at a vector field Z will yield the same new vector field. This equation took a great deal of computation to verify, but they were eventually convinced it was correct. Also, f and g are scalar functions, so expressions like fX1 refer to multiplication, not composition. In addition, if X and Y are fixed, then the operator R(X,Y) : X(M) -> X(M) is linear, so R(X,Y)(Z+W) = R(X,Y)(Z) + R(X,Y)(W) and R(X,Y)(fZ) = f R(X,Y)Z.

What we know as the Bianchi Identity followed directly from the Jacobi Identity and the fact that the Riemannian Connection is symmetric (i.e., DelXY – DelYX = [X,Y]): R(X,Y)Z + R(Y,Z)X + R(Z,X)Y = 0.
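For the curious, one way the Bianchi Identity might be derived is sketched here. It uses the sign convention R(X,Y)Z = DelYDelX(Z) - DelXDelY(Z) + Del[X,Y]Z from the dialogue, writes Del as nabla, and regroups the nine terms by their outermost derivative before applying the symmetry of the connection twice.

```latex
\begin{align*}
R(X,Y)Z + R(Y,Z)X + R(Z,X)Y
 &= \nabla_Y(\nabla_X Z - \nabla_Z X) + \nabla_Z(\nabla_Y X - \nabla_X Y) + \nabla_X(\nabla_Z Y - \nabla_Y Z) \\
 &\quad + \nabla_{[X,Y]}Z + \nabla_{[Y,Z]}X + \nabla_{[Z,X]}Y \\
 &= \big(\nabla_Y[X,Z] - \nabla_{[X,Z]}Y\big) + \big(\nabla_Z[Y,X] - \nabla_{[Y,X]}Z\big) + \big(\nabla_X[Z,Y] - \nabla_{[Z,Y]}X\big) \\
 &= [Y,[X,Z]] + [Z,[Y,X]] + [X,[Z,Y]] = 0 .
\end{align*}
```

The final equality is the Jacobi Identity.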

The idea of applying inner products arose naturally, and they thought that taking the inner product of the curvature applied to one vector field with a second would be useful, so they coined the expression (X,Y,Z,T) = <R(X,Y)Z, T>. Several types of symmetry unfolded with respect to this definition:

A: (X, Y, Z, T) + (Y, Z, X, T) + (Z, X, Y, T) = 0

B: (X, Y, Z, T) = - (Y, X, Z, T)

C: (X, Y, Z, T) = -(X, Y, T, Z)

D: (X, Y, Z, T) = (Z, T, X, Y)

Spherius remembered these last three by envisioning the four fields in two blocks (X, Y) and (Z, T). Switching within a block would change the sign, but switching the blocks themselves would not. Another obvious step was to write the expression in local coordinates. Given ∂/∂xi = Xi as usual, so Xi is the image of the ith cardinal direction under the differential map, define R(Xi, Xj) Xk = Σl R^l_ijk Xl, meaning R^l_ijk is the lth component of the new vector field. Then <R(Xi, Xj) Xk, Xs> = Σl R^l_ijk gls = Rijks.
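Translated into components, the four symmetries A through D might read as follows. The sketch takes R_ijks to mean the inner product of R(Xi, Xj)Xk with Xs, and places the upper index on the vector component, which is a notational assumption made here.

```latex
\begin{align*}
R_{ijks} &:= \langle R(X_i, X_j)X_k,\ X_s\rangle = \sum_l R^{l}_{ijk}\, g_{ls}, \\
R_{ijks} + R_{jkis} + R_{kijs} &= 0, \\
R_{ijks} = -R_{jiks}, \qquad R_{ijks} &= -R_{ijsk}, \qquad R_{ijks} = R_{ksij} .
\end{align*}
```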


